Introduction


This project to document the PlayStation started about a year ago. It began with the utter disgust I had for
Sony of America after it sued Bleem over its PSX emulation technology. I saw the ugliness of a huge multinational
company trying to destroy two guys who had a good idea and had even tried to share it with them. It made me sick.
I wanted to do something to help, but alas I had no money (I still don't), though I did buy a Bleem CD to support them.

I decided to start this little project, partially to prove to Sony, but mostly to prove to myself, that coming up
with the data to create your own emulator was not that hard. I also wanted to show what is behind that gray box that
so many people hold dear: it's just a computer with no keyboard that plugs into your TV. It's one thing to think that
you are spending $250 on a new PSX, but it's another to realize that the CPU costs $5.99 from LSI.

Kind of puts things into perspective, doesn't it?

I'm not a programmer. I've never worked for Sony, and I never signed a Non-Disclosure Agreement with
them. I just took my PSX apart, found out what made it tick, and put it back together. I also scoured the web looking
for material that I could find. I never looked at any of Sony's official documentation and never took anything you
would have to have a license to get, such as PSY-Q. I mostly poked at emulators to see how they worked. Bleem was
only 512k at the time, and it was pretty easy to see how it functioned without even running it through a disassembler.
PSEmu had an awesome debugger, so I could see how a PSX ran even without Caetla.

I want this documentation to be freely available. Anyone can use it, from the seasoned PSX programmer to
the lurking programmer ready to make the next big emulator. If there is a discrepancy in my doc, please fix it. Tear
out parts that are wrong and correct them so it's better than what I have now. I wanted to shoot for a 75% accuracy
rating. I think I got it, but I don't know. Most of the stuff in here is hearsay and logical deductions. Much of it is
merely a guess.

Of course there is the standard disclaimer: all trademarks belong to their respective owners, and this
documentation is not endorsed by Sony or Bleem in any way. You are, once again, free to give this away, trade it, or
do what you will. It's not mine anymore. It's everybody's. Do with it what you please. Oh, and if your PSX blows up
or melts down due to this documentation, sorry. I can't assure the validity of *any* info, other than that I didn't
get it from Sony's official documentation. I'm not responsible for what you do to your machine.

In closing I wish to apologize for the way this introduction was written, as it's 2:00 in the morning. I have a
wedding to get to at 10:30 and I've been up for the last three days finishing the darn thing. I wish to thank everyone
who supported me: Janice, for believing in me, and my girlfriend Kim, who put up with the long nights in front of the
computer writing and the long days in front of the PlayStation claiming I was "doing research" while playing FF8. I
can't think of anything more to say. Have fun with this.


-Joshua Walker 


4/29/2000 
2:34am 
History
Prologue B.P. (Before PlayStation) 


Before the release of the PlayStation, Sony had never held a large portion of the videogames market. It had 
made a few forays into the computer side of things, most notably in its involvement with the failed MSX chip in the 
early 80's, but it wasn't until the advent of CD-ROM technology that Sony could claim any market share. A joint 
venture with the Dutch company Philips had yielded the CD-ROM/XA, an extension of the CD-ROM format that 
combined compressed audio, and visual and computer data and allowed both to be accessed simultaneously with the 
aid of extra hardware. By the late 80's, CD-ROM technology was being assimilated, albeit slowly, into the home
computer market, and Sony was right there alongside it. But they wanted a bigger piece of the pie.


1988 Sony Enters The Arena 


By 1988, the gaming world was firmly gripped in Nintendo's 8-bit fist. Sega had yet to make a proper 
showing, and Sony, although hungry for some action, hadn't made any moves of its own. 

Yet. 

Sony's first foray into the gaming market came in 1988, when it embarked on a deal with Nintendo to 
develop a CD-ROM drive for the Super NES, not scheduled to be released for another 18 months. This was Sony's 
chance to finally get involved in the home videogame market. What better way to enter that arena than on the
coattails of the world's biggest gaming company?

Using the same Super Disc technology as the proposed SNES drive, Sony began development on what was
to eventually become the PlayStation. Initially called the Super Disc, it was supposed to be able to play both SNES
cartridges and CD-ROMs, of which Sony was to be the "sole worldwide licenser," as stated in the contract. Nintendo 
was now to be at the mercy of Sony, who could manufacture their own CDs, play SNES carts, and play Sony CDs. 
Needless to say, Nintendo began to get worried. 


1991 Multimedia Comes Home 


1991 saw the commercial release of the multimedia machine in the form of Philips' CD-I, which had 
initially been developed jointly by both Philips and Sony until mounting conflicts resulted in a parting of ways. 
Multimedia, with the current rise of the CD-ROM, was seen as the next big thing. And although the CD-I was too 
expensive for the mass market, its arrival cemented the CD-ROM as a medium for entertainment beyond the 
computer. 


June 1991 Treachery At The 11th Hour 


In June of 1991, at the Chicago CES (Consumer Electronics Show), Sony officially announced the Play
Station (space intentional). The Play Station would have a port to play Super Nintendo cartridges, as well as a
CD-ROM drive that would play Sony Super Discs. The machine would be able to play videogames as well as other forms
of interactive entertainment, as was considered important at the time.

Sony intended to draw on its family of companies, including Sony Music and Columbia Pictures, to develop 
software. Olaf Olafsson, then chief of Sony Electronic Publishing, was seen on the set of Hook, Steven Spielberg's 
new Peter Pan movie, presumably deciding how the movie could be worked into a game for the fledgling Play 
Station. In Fortune magazine, Olafsson was quoted as saying "The video-game business... will be much more 
interesting (than when it was cartridge based). By owning a studio, we can get involved right from the beginning, 
during the writing of the movie." 

By this point, Nintendo had had just about all it could take. On top of the deal signed in 1988, Sony had 
also contributed the main audio chip to the cartridge-based Super NES. 

The Ken Kutaragi-designed chip was a key element of the system, but was designed in such a way as to
make effective development possible only with Sony's expensive development tools. Sony had also retained all rights
to the chip, which further exasperated Nintendo.


The day after Sony announced its plans to begin work on the Play Station, Nintendo made an announcement 
of its own. Instead of confirming its alliance with Sony, as everyone expected, Nintendo announced it was working 
with Philips, Sony's longtime rivals, on the SNES CD-ROM drive. Sony was understandably furious. 

Because of their contract-breaking actions, Nintendo not only faced legal repercussions from Sony, but 
could also experience a serious backlash from the Japanese business community. Nintendo had broken the unwritten 
law that a company shouldn't turn against a reigning Japanese company in favor of a foreign one. 

However, Nintendo managed to escape without a penalty. Because of their mutual involvement, it would be 
in the best interests of both companies to maintain friendly relations. Sony, after all, was planning a port for SNES 
carts, and Nintendo was still using the Sony audio chip. 


1992 The Smoke Clears 


By the end of 1992, most of the storm had blown over. Despite a deal penned between Sega, one of
Nintendo's biggest competitors, and Sony, whereby Sony would produce software for the proposed Sega Multimedia
Entertainment System, an agreement was reached with Nintendo. In October of 1992, it was announced that the two
companies' CD-ROM players would be compatible. The software licensing for the proposed 32-bit machines was
awarded to Nintendo, with Sony receiving only minimal licensing royalties. Nintendo had won this battle, but hadn't 
won the war. Not by a long shot. 

The first Play Station never made it out of the factories. Apparently, about 200 were produced, and some 
software even made it to development. For whatever reason, whether it was because of the tough licensing deal with 
Nintendo, or the predicted passing of masked ROM (cartridge-based) technology, Sony scrapped its prototype. Steve 
Race, Sony Computer Entertainment Of America's (SCEA) then CEO, stated, "Since the deal with Nintendo didn't 
come to fruition we decided to put games on a back burner and wait for the next category. Generally, the gaming 
industry has a seven-year product life-cycle, so we bided our time until we could get in on the next cycle." 


1993 The Next Cycle 


After returning to the drawing boards, Sony revealed the PS-X, or PlayStation-X. Gone was the original 
cartridge port, as were the plans for multimedia. Apparently, Sony had visited 3DO when Trip Hawkins was selling 
his hardware and had come away unimpressed, saying it was "nothing new." The PS-X was to be a dedicated game- 
machine, pure and simple. Steve Race said in Next Generation magazine, "We designed the PlayStation to be the 
best game player we could possibly make. Games really are multimedia, no matter what we want to call it. The 
conclusion is that the PlayStation is a multimedia machine that is positioned as the ultimate game player." 

Key to Sony's battle plan was the implementation of 3D into its graphics capabilities, a move that many felt 
was critical to any kind of future success. At the core of the PlayStation's 3D prowess was the R3000 processor,
operating at 33 MHz and 30 MIPS (millions of instructions per second). While this may seem fairly average for a
RISC CPU, it's the PlayStation's supplementary custom hardware, designed by Ken Kutaragi (who had previously 
designed the key audio chip for the SNES), that provides the real power. The CPU relies heavily on Kutaragi's VLSI 
(very large scale integration) chip to provide the speed necessary to process complex 3D graphics quickly. 

The CPU is backed up by the GPU (Graphics Processing Unit), which takes care of all the data from the 
CPU and passes the results to the 1024K of dual-ported VRAM, which stores the current frame buffer and allows the 
picture to be displayed on-screen. Part of this picture involves adding special effects such as transparency and fog, 
something that the PlayStation excels at. This was to be the most impressive display of hardware the home gaming
world had ever seen.


1994 Third Party Round Up 


There was no doubt that Sony could deliver the hardware. After all, Sony was one of the world's largest 
manufacturers of electronics. There was no denying though, that Sony was extremely green when it came to 
videogames. And no one knew it better than Sony. 

Not wanting to end up like Atari or 3DO, Sony set about rounding up third party developers, assembling an
impressive 250 developing parties in Japan alone. Sony also knew it had to gain the support of the millions of
arcade-going gamers if it was to succeed. The involvement of Namco, Konami, and Williams helped ensure Sony
would be able to compete with the arcade-savvy Sega on its own turf. Namco's Ridge Racer was the natural choice to
be the flagship launch game, and Williams' Mortal Kombat III, previously promised to Nintendo for their Ultra 64,
could be tested in the arcades using the new PS-X board.

Perhaps Sony's most controversial acquisition was the purchase of Psygnosis, a relatively unknown 
European developer, for $48 million. Sony needed a strong in-house development team, and Psygnosis' Lemmings 
seemed to point at good things. While the purchase confused many at the time, prompting vocal outcries from 
naysayers and competitors alike, Psygnosis has since proven them all wrong. Sony Interactive Entertainment, as 
Psygnosis was renamed, has been responsible for some of the PlayStation's best games, including WipeOut and 
Destruction Derby. 

Sony's acquisition of Psygnosis yielded another fruit as well: the development system. SN Systems,
co-owned by Andy Beveridge and Martin Day, had been publishing its development software through Psygnosis under
the PSY-Q moniker. Sony originally had been planning on using expensive, Japanese MIPS R4000-based machines 
that would be connected to the prototype PS-X box. Having become accustomed to developing on the PC, Psygnosis 
gave Beveridge and Day first crack at creating a PlayStation development system that would work with a standard 
PC. 

The two men worked through Christmas and New Year's, around the clock, eventually completing the
GNU-C compiler and the source-level debugger. Psygnosis quickly arranged a meeting for SN and Sony at the
Winter CES in Las Vegas, 1994. Fortunately, Sony liked the PSY-Q alternative and decided to work with SN
Systems on condensing the software onto two PC-compatible cards. Thus, an affordable and, more importantly,
universally compatible PlayStation development station was born.


December 3, 1994 We Have Lift Off 


On December 3, 1994, the PlayStation was finally released in Japan, one week after the Sega Saturn. The 
initial retail cost was 37,000 yen, or about $387. Software available at launch included King's Field, Crime Crackers, 
and Namco's Ridge Racer, the PlayStation's first certifiable killer app. It was met with long lines across Japan, and 
was hailed by Sony as their most important product since the WalkMan in the late 1970's. 

Also available at launch were a host of peripherals, including: a memory card to save high scores and 
games; a link cable, whereby you could connect two PlayStations and two TVs and play against a friend; a mouse 
with pad for PC ports; an RFU Adaptor; an S-Video Adaptor; and a Multitap Unit. Third party peripherals were also 
available, including Namco's Negcon. 

The look of the PlayStation was dramatically different from the Saturn, which was beige (in Japan), bulky,
and somewhat clumsy looking. In contrast, the PlayStation was slim, sleek, and gray, with a revolutionary controller
that was years ahead of the Saturn's SNES-like pad. The new PSX joypad provided unheard-of control by adding two
more buttons on the shoulder, making a total of eight buttons. The two extended grips also added a new element of
control. Ken Kutaragi realized the importance of control when dealing with three-dimensional game worlds. "We
probably spent as much time on the joypad's development as the body of the machine. Sony's boss showed special
interest in achieving the final version so it has his seal of approval." To Sony's delight, the PlayStation sold more
than 300,000 units in the first 30 days. The Saturn claimed to have sold 400,000, but research has shown that number
to be misleading. The PSX sold through (to customers) 97% of its stock, while many Saturns were still sitting on the
shelves. These misleading numbers were to be quoted by Sega on many occasions, and continued even after the US
launch.


1995 Setting Up House 


By mid-1995, Sony had set its sights firmly on the United States. Sony Computer Entertainment of America 
was created and housed in Foster City, California, in the heart of Silicon Valley. Steve Race, formerly of Atari, was 
appointed as president and CEO of the new branch of Sony. The accumulation of third party developers continued 
apace, with over 100 licenses in the US and 270 licenses in Japan secured. Steve Race said, "We've allowed people 
to come in and to play on the PlayStation - and at a much more reasonable cost than has been done in the old days 
with Nintendo and Sega." Sony's development strategy had paid off, with over 700 development units having been 
shipped out worldwide. 


May 11, 1995 Victory At E3 


The Electronic Entertainment Expo (E3) was held in Los Angeles from May 11 to 13, 1995, and was the
United States' first real look at the PlayStation. Sony made a huge impression at the show with their (rumored) $4
million booth and a surprise appearance by Michael Jackson. The PSX was definitely the highlight of the show,
besting Sega's Saturn and Nintendo's laughable Virtual Boy.

The launch software was also displayed, with WipeOut and Namco's Tekken and Ridge Racer drawing the 
most praise. Sony also announced the unit would not be bundled with Ridge Racer, as was previously assumed. 

Overall, Sony made a very formidable showing at E3. They had already proven themselves in Japan and 
were close on Sega's heels. Over the course of the next year they would overtake Sega and conquer Japan as their 
own. Now they were poised to do the same in America. 


September 9, 1995 You Are Not Ready 


The PlayStation launched in the United States on September 9, 1995 to instant success. Although it retailed 
for $299, that was still $100 less than the Sega Saturn. Over 100,000 units were already presold at launch, and 17 
games were available. Stores reported sell-outs across the country, and sold out of many games and peripherals as 
well, including second controllers and memory cards. 

Sony's initial marketing strategy seemed to be aimed at an older audience than the traditional 8-16 year old 
demographic of the past. With the tag line "U R Not E" (the "E" being red) and a rumored $40 million to spend on 
launch marketing, Sony swiftly positioned itself as the market leader. To further cement their audience demographic, 
Sony sponsored the 1995 MTV Music Awards. 


Epilogue What A Year 


By the US launch, Sony had sold over one million PlayStations in Japan alone. Since the US launch, as of late 1996, 
the PlayStation has sold over 7 million units worldwide, with close to two million of those being in the US alone. In 
May of 1996, Sony dropped the price of the PlayStation to $199, making it even more attractive to buy. 

Like Japan, America and Europe embraced the PlayStation as their next-gen console of choice. The age of
the typical PlayStation owner has fallen steadily from twenty-somethings to the younger age bracket
so coveted by Nintendo. In fact, many former Nintendo loyalists, tired of waiting for the Nintendo 64 to be released,
bought PlayStations and are now happier for it. With close to 200 games available by Christmas 1996, it's easy to see 
why. This really is the ultimate gaming console! 


The R3000A 


Overview 

The heart of the PSX is a slightly modified R3000A CPU from MIPS and LSI. This is a 32-bit Reduced
Instruction Set Computer (RISC) processor that clocks at 33.8688 MHz. It has an operating performance of 30
million instructions per second. In addition, it has an internal instruction cache of 4 KB, a data cache of 1 KB, and
a bus transfer rate of 132 MB/sec. Internally it has one Arithmetic/Logic Unit (ALU) and one shifter, and it totally
lacks an FPU (floating point unit). The R3000A is configured for little-endian byte order and defines a word as
32 bits, a half-word as 16 bits, and a byte as 8 bits.

The PSX has two coprocessors: cop0, the System Control coprocessor, and cop2, the GTE or Geometry
Transformation Engine. These are covered later on in this document.


Instruction cache 

The PSX's R3000A contains 4 KB of instruction cache. The instruction cache is organized with a line size
of 16 bytes. This should achieve a hit rate of around 80%. The cache is implemented using physical addresses and
tags, as opposed to virtual ones.
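
As a rough illustration of what a 4 KB cache with 16-byte lines implies, the sketch below splits a physical address
into a tag, a line index, and a byte offset. It assumes the instruction cache is direct mapped (the text only says so
outright for the data cache), and the names `split_address`, `CACHE_SIZE`, and `LINE_SIZE` are made up for this example.

```python
CACHE_SIZE = 4 * 1024                  # 4 KB instruction cache (from the text)
LINE_SIZE = 16                         # 16-byte lines, i.e. four 32-bit instructions
NUM_LINES = CACHE_SIZE // LINE_SIZE    # 256 lines

OFFSET_BITS = LINE_SIZE.bit_length() - 1   # 4 bits select a byte within a line
INDEX_BITS = NUM_LINES.bit_length() - 1    # 8 bits select a line


def split_address(addr):
    """Return (tag, index, offset) for a 32-bit physical address.

    Assumes a direct-mapped cache: each address can live in exactly one line.
    """
    addr &= 0xFFFFFFFF
    offset = addr & (LINE_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    # Two addresses exactly 4 KB apart land on the same line with different tags,
    # so in a direct-mapped cache they would evict each other.
    print(split_address(0x00010004))
    print(split_address(0x00011004))
```

Under these assumptions, code that bounces between two routines 4 KB apart would keep missing, which is one reason
a hit rate is quoted as a typical figure rather than a guarantee.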


Data cache 

The PSX's R3000A incorporates an on-chip data cache of 1 KB, organized with a line size of 4 bytes (one
word). This should also achieve hit rates of 80% in most applications. It is also a direct-mapped physical
address cache. The data cache is implemented as a write-through cache, so that main memory always matches the
internal cache. In order to minimize processor stalls due to data write operations, the bus interface unit
uses a 4-deep write buffer which captures address and data at the processor execution rate, allowing it to be retired
to main memory at a much slower rate without impacting system performance.
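
The idea of a 4-deep write buffer can be sketched as a small queue sitting between the CPU and memory. This is only
a toy model: the class name `WriteBuffer`, the rule that the CPU stalls when the buffer is full, and oldest-first
retirement are all assumptions made for illustration, not details taken from the hardware.

```python
from collections import deque


class WriteBuffer:
    """Toy model of a write buffer in front of slow main memory."""

    def __init__(self, depth=4):
        self.depth = depth          # 4 entries, per the text
        self.pending = deque()      # queued (address, value) pairs
        self.stalls = 0             # times the CPU had to wait (assumed policy)

    def cpu_write(self, address, value, memory):
        # Assumption: when the buffer is full the CPU stalls until one
        # entry has been retired to memory.
        if len(self.pending) == self.depth:
            self.stalls += 1
            self.retire_one(memory)
        self.pending.append((address, value))

    def retire_one(self, memory):
        # Assumption: entries are retired oldest first.
        if self.pending:
            address, value = self.pending.popleft()
            memory[address] = value


if __name__ == "__main__":
    memory = {}
    buffer = WriteBuffer()
    for i in range(5):              # five back-to-back stores, no time to drain
        buffer.cpu_write(0x100 + 4 * i, i, memory)
    print(buffer.stalls, memory)
```

In this model the first four stores cost the CPU nothing; only the fifth has to wait for memory.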


32 bit architecture 
The R3000A uses thirty-two 32-bit registers, a 32 bit program counter, and two 32 bit registers for 
multiply/divide functions. The following table lists the registers by register number, name, and usage. 


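General Purpose Registers

Register number   Name     Usage
R0                ZR       Constant Zero
R1                AT       Reserved for the assembler
R2-R3             V0-V1    Values for results and expression evaluation
R4-R7             A0-A3    Arguments
R8-R15            T0-T7    Temporaries (not preserved across call)
R16-R23           S0-S7    Saved (preserved across call)
R24-R25           T8-T9    More temporaries (not preserved across call)
R26-R27           K0-K1    Reserved for OS Kernel
R28               GP       Global Pointer
R29               SP       Stack Pointer
R30               FP       Frame Pointer
R31               RA       Return address (set by function call)


Multiply/Divide result Registers and Program counter

Name     Usage
HI       Multiplication 64 bit high result or division remainder
LO       Multiplication 64 bit low result or division quotient
PC       Program Counter
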


Even though all general purpose registers have different names, they are all treated the same except for two.
The R0 (ZR) register is hardwired to zero. The second exception is R31 (RA), which is used as a link register when
jump and link or branch and link instructions are executed. These instructions are used in subroutine calls, and the
subroutine return address is placed in register R31. This register can be written to or read as a normal register in
other operations.
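
A minimal sketch of how an emulator might model this register set is shown below. The class and method names are
hypothetical; the only behavior taken from the text is that there are thirty-two 32-bit registers plus HI, LO and PC,
and that R0 always reads as zero.

```python
RA = 31  # R31, the link register


class RegisterFile:
    """Thirty-two 32-bit general registers plus HI, LO and PC."""

    def __init__(self):
        self.regs = [0] * 32
        self.hi = 0
        self.lo = 0
        self.pc = 0

    def read(self, number):
        return self.regs[number]

    def write(self, number, value):
        # R0 (ZR) is hardwired to zero, so writes to it are simply dropped.
        if number != 0:
            self.regs[number] = value & 0xFFFFFFFF


if __name__ == "__main__":
    cpu = RegisterFile()
    cpu.write(0, 1234)           # ignored
    cpu.write(RA, 0x80010008)    # behaves like any other register
    print(cpu.read(0), hex(cpu.read(RA)))
```

Dropping writes to R0 in one place keeps every instruction handler free of special cases.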


R3000A Instruction set 


The instruction encoding is based on the MIPS architecture. This means that there are three types of
instruction encoding.


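I-Type (Immediate)

| op | rs | rt | immediate |


J-Type (Jump)

| op | target |


R-Type (Register)

| op | rs | rt | rd | shamt | funct |

where:

op          is a 6-bit operation code
rs          is a five bit source register specifier
rt          is a 5-bit target register or branch condition
immediate   is a 16-bit immediate, or branch or address displacement
target      is a 26-bit jump target address
rd          is a 5-bit destination register specifier
shamt       is a 5-bit shift amount
funct       is a 6-bit function field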
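
To make the three layouts concrete, here is a small sketch that pulls every field out of a 32-bit instruction word.
The bit positions are the standard MIPS ones (op in the top six bits, funct in the bottom six); the function name
`decode` and the dictionary it returns are just conveniences for this example.

```python
def decode(word):
    """Split a 32-bit instruction word into all possible fields.

    Which fields are meaningful depends on the format: I-Type uses
    op/rs/rt/immediate, J-Type uses op/target, R-Type uses
    op/rs/rt/rd/shamt/funct.
    """
    word &= 0xFFFFFFFF
    return {
        "op":        (word >> 26) & 0x3F,      # 6 bits
        "rs":        (word >> 21) & 0x1F,      # 5 bits
        "rt":        (word >> 16) & 0x1F,      # 5 bits
        "rd":        (word >> 11) & 0x1F,      # 5 bits
        "shamt":     (word >> 6) & 0x1F,       # 5 bits
        "funct":     word & 0x3F,              # 6 bits
        "immediate": word & 0xFFFF,            # 16 bits
        "target":    word & 0x03FFFFFF,        # 26 bits
    }


if __name__ == "__main__":
    # 0x00851020 is an R-Type word with rs=4, rt=5, rd=2, funct=0x20.
    fields = decode(0x00851020)
    print(fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
```

Decoding every field up front is wasteful but simple; a real decoder would normally look at op first and extract only
the fields that format uses.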





The R3000A instruction set can be divided into the following basic groups: 

Load/Store instructions move data between memory and the general registers. They are all encoded as
"I-Type" instructions, and the only addressing mode implemented is base register plus signed, immediate offset. This
directly enables the use of three distinct addressing modes: register plus offset; register direct; and immediate.
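
The address calculation described here is small enough to sketch directly. The helper names below are hypothetical.
The three "modes" fall out of one formula: a non-zero offset gives register plus offset, a zero offset gives register
direct, and using R0 (always zero) as the base leaves just the immediate.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base_value, offset):
    """Base register contents plus sign-extended 16-bit offset, wrapped to 32 bits."""
    return (base_value + sign_extend16(offset)) & 0xFFFFFFFF


if __name__ == "__main__":
    base = 0x80010000
    print(hex(effective_address(base, 0xFFFC)))   # register plus offset (offset is -4)
    print(hex(effective_address(base, 0)))        # register direct
    print(hex(effective_address(0, 0x0100)))      # immediate, base is R0
```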

Computational instructions perform arithmetic, logical, and shift operations on values in registers. They
are encoded as either "R-Type" instructions, when both source operands as well as the result are general registers,
or "I-Type" instructions, when one of the source operands is a 16-bit immediate value. Computational instructions
use a three address format, so that operations don't needlessly interfere with the contents of source registers.

Jump and Branch instructions change the control flow of a program. A Jump instruction can be encoded
as a "J-Type" instruction, in which case the Jump target address is a paged absolute address formed by combining
the 26-bit immediate value with four bits of the Program Counter. This form is used for subroutine calls. Alternately,
Jumps can be encoded using the "R-Type" format, in which case the target address is a 32-bit value contained in one
of the general registers. This form is typically used for returns and dispatches. Branch operations are encoded as
"I-Type" instructions. The target address is formed from a 16-bit displacement relative to the Program Counter. The
Jump and Link instructions save a return address in Register r31. These are typically used as subroutine calls, where
the subroutine return address is stored into r31 during the call operation.
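
The two target calculations can be sketched as follows. The function names are made up, and one detail is an
assumption on my part: the four PC bits for a jump are taken from the address of the delay slot (the instruction
after the jump), which is the usual MIPS behavior and only differs from the jump's own address at a 256 MB boundary.

```python
def jump_target(pc, target26):
    """Paged absolute address for a J-Type jump located at address pc."""
    delay_slot = (pc + 4) & 0xFFFFFFFF
    # Top 4 bits come from the PC, the rest from the 26-bit field shifted left 2.
    return (delay_slot & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


def branch_target(pc, offset16):
    """PC-relative address for a branch located at address pc."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:              # sign-extend the 16-bit displacement
        offset -= 0x10000
    delay_slot = pc + 4              # displacement is relative to the delay slot
    return (delay_slot + (offset << 2)) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(jump_target(0x80010000, 0x0004010)))
    print(hex(branch_target(0x80010000, 0xFFFF)))   # offset -1: branch to itself
    print(hex(branch_target(0x80010000, 2)))
```

Because the 26-bit field is shifted left two bits, a jump can reach any word inside the current 256 MB "page", while
a branch can reach roughly 128 KB either side of itself.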

Co-Processor instructions perform operations on the co-processor set. Co-Processor Loads and Stores are
always encoded as "I-Type" instructions; co-processor operational instructions have co-processor dependent formats.
In the R3000A, the System Control Co-Processor (cop0) contains registers which are used in memory management
and exception handling.


Special instructions perform a variety of tasks, including movement of data between special and general 
registers, system calls, and breakpoint operations. They are always encoded as "R-Type" instructions. 


INSTRUCTION SET SUMMARY 


The following tables describe the assembly instructions for the R3000A. Please refer to the appendix for
more detail about opcode encoding.


Load and Store Instructions 


Load Byte LB rt, offset (base) 
Sign-extend 16-bit offset and add to contents of register base to form address. 
Sign-extend contents of addressed byte and load into rt. 

Load Byte Unsigned LBU rt, offset (base) 
Sign-extend 16-bit offset and add to contents of register base to form address. 
Zero-extend contents of addressed byte and load into rt. 

Load Halfword LH rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Sign-extend contents of addressed halfword and load into rt.

Load Halfword Unsigned LHU rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Zero-extend contents of addressed halfword and load into rt.

Load Word LW rt, offset (base) 
Sign-extend 16-bit offset and add to contents of register base to form address. 
Load contents of addressed word into register rt. 


Load Word Left LWL rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Shift addressed word left so that addressed byte is leftmost byte of a word.
Merge bytes from memory with contents of register rt and load result into register
rt.

Load Word Right LWR rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Shift addressed word right so that addressed byte is rightmost byte of a word.
Merge bytes from memory with contents of register rt and load result into register
rt.

Store Byte SB rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Store least significant byte of register rt at addressed location.

Store Halfword SH rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Store least significant halfword of register rt at addressed location.

Store Word SW rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Store least significant word of register rt at addressed location.

Store Word Left SWL rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Shift contents of register rt right so that leftmost byte of the word is in position of
addressed byte. Store bytes containing original data into corresponding bytes at
addressed byte.

Store Word Right SWR rt, offset (base)
Sign-extend 16-bit offset and add to contents of register base to form address.
Shift contents of register rt left so that rightmost byte of the word is in position of
addressed byte. Store bytes containing original data into corresponding bytes at
addressed byte.
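
Two points in the load descriptions above are easy to get wrong in an emulator: the difference between sign and zero
extension, and how LWL and LWR combine to read a word from an unaligned address. The sketch below models both on a
plain byte string. It assumes little-endian byte order (as stated in the overview) and standard MIPS little-endian
behavior for LWL/LWR; all function names are invented for the example.

```python
def read_word(mem, addr):
    """Little-endian 32-bit read from a bytes object at an aligned address."""
    return int.from_bytes(mem[addr:addr + 4], "little")


def lb(mem, addr):
    """Load Byte: sign-extend the byte to 32 bits."""
    byte = mem[addr]
    return (byte - 0x100 if byte & 0x80 else byte) & 0xFFFFFFFF


def lbu(mem, addr):
    """Load Byte Unsigned: zero-extend the byte to 32 bits."""
    return mem[addr]


def lwl(mem, addr, rt):
    """Load Word Left: addressed byte becomes the leftmost byte of rt."""
    word = read_word(mem, addr & ~3)
    shift = 8 * (3 - (addr & 3))
    keep = (1 << shift) - 1                      # low bytes of rt that survive
    return ((word << shift) & 0xFFFFFFFF) | (rt & keep)


def lwr(mem, addr, rt):
    """Load Word Right: addressed byte becomes the rightmost byte of rt."""
    word = read_word(mem, addr & ~3)
    shift = 8 * (addr & 3)
    keep = 0xFFFFFFFF ^ (0xFFFFFFFF >> shift)    # high bytes of rt that survive
    return (word >> shift) | (rt & keep)


def load_unaligned(mem, addr):
    """The usual little-endian pairing: LWR at addr, then LWL at addr + 3."""
    return lwl(mem, addr + 3, lwr(mem, addr, 0))


if __name__ == "__main__":
    mem = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77])
    print(hex(load_unaligned(mem, 1)))           # bytes 11 22 33 44
    print(hex(lb(bytes([0x80]), 0)), hex(lbu(bytes([0x80]), 0)))
```

SWL and SWR mirror the same byte selection in the store direction.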





Computational Instructions 


ALU Immediate Operations 


Instruction Format and Description 


ADD Immediate ADDI rt, rs, immediate
Add 16-bit sign-extended immediate to register rs and place 32-bit result in
register rt. Trap on two's complement overflow.

ADD Immediate Unsigned ADDIU rt, rs, immediate
Add 16-bit sign-extended immediate to register rs and place 32-bit result in
register rt. Do not trap on overflow.

Set on Less Than Immediate SLTI rt, rs, immediate 
Compare 16-bit sign-extended immediate with register rs as signed 32-bit 
integers. Result = 1 if rs is less than immediate; otherwise result = 0. 
Place result in register rt. 

Set on Less Than Unsigned Immediate SLTIU rt, rs, immediate 
Compare 16-bit sign-extended immediate with register rs as unsigned 32-bit 
integers. Result = 1 if rs is less than immediate; otherwise result = 0. Place 
result in register rt. Do not trap on overflow. 


AND Immediate ANDI rt, rs, immediate
Zero-extend 16-bit immediate, AND with contents of register rs and place result
in register rt.

OR Immediate ORI rt, rs, immediate
Zero-extend 16-bit immediate, OR with contents of register rs and place result in
register rt.

Exclusive OR Immediate XORI rt, rs, immediate
Zero-extend 16-bit immediate, exclusive OR with contents of register rs and
place result in register rt.

Load Upper Immediate LUI rt, immediate
Shift 16-bit immediate left 16 bits. Set least significant 16 bits of word to zeroes.
Store result in register rt.
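
A short sketch of a few of these operations may help show where the trapping and non-trapping forms differ, and how
LUI and ORI are typically paired to build a full 32-bit constant. Registers are modelled as unsigned Python integers,
and the `OverflowTrap` exception is only a stand-in for the CPU's overflow exception; none of the names come from
the original tables.

```python
class OverflowTrap(Exception):
    """Stand-in for the CPU's arithmetic overflow exception."""


def to_signed32(value):
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def sign_extend16(imm):
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addi(rs, imm):
    """ADDI: trap if the signed 32-bit result overflows."""
    result = to_signed32(rs) + sign_extend16(imm)
    if not -0x80000000 <= result <= 0x7FFFFFFF:
        raise OverflowTrap()
    return result & 0xFFFFFFFF


def addiu(rs, imm):
    """ADDIU: same sum, but wrap silently instead of trapping."""
    return (rs + sign_extend16(imm)) & 0xFFFFFFFF


def lui(imm):
    """LUI: immediate goes in the upper half, lower half is zero."""
    return (imm & 0xFFFF) << 16


def ori(rs, imm):
    """ORI: the immediate is zero-extended, not sign-extended."""
    return (rs & 0xFFFFFFFF) | (imm & 0xFFFF)


if __name__ == "__main__":
    print(hex(addiu(0x7FFFFFFF, 1)))          # wraps around
    print(hex(ori(lui(0x8001), 0x0040)))      # builds a 32-bit constant
```

Note that despite its name, ADDIU still sign-extends its immediate; "unsigned" only means it does not trap.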


Three Operand Register-Type Operations 


Instruction Format and Description 


Add ADD rd, rs, rt
Add contents of registers rs and rt and place 32-bit result in register rd. Trap on
two's complement overflow.

Add Unsigned ADDU rd, rs, rt
Add contents of registers rs and rt and place 32-bit result in register rd. Do not
trap on overflow.

Subtract SUB rd, rs, rt
Subtract contents of register rt from register rs and place 32-bit result in register rd.
Trap on two's complement overflow.

Subtract Unsigned SUBU rd, rs, rt
Subtract contents of register rt from register rs and place 32-bit result in register rd.
Do not trap on overflow.

Set on Less Than SLT rd, rs, rt
Compare contents of register rt to register rs (as signed 32-bit integers).
If register rs is less than rt, result = 1; otherwise, result = 0.

Set on Less Than Unsigned SLTU rd, rs, rt
Compare contents of register rt to register rs (as unsigned 32-bit integers). If
register rs is less than rt, result = 1; otherwise, result = 0.

AND AND rd, rs, rt
Bit-wise AND contents of registers rs and rt and place result in register rd.

OR OR rd, rs, rt
Bit-wise OR contents of registers rs and rt and place result in register rd.

Exclusive OR XOR rd, rs, rt
Bit-wise Exclusive OR contents of registers rs and rt and place result in register
rd.

NOR NOR rd, rs, rt
Bit-wise NOR contents of registers rs and rt and place result in register rd.
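
The signed and unsigned comparisons are the ones most likely to trip up an emulator written in a language without
fixed-width integers, so here is a hedged sketch of a few register-type operations. Registers are held as unsigned
32-bit values; the function names simply mirror the mnemonics and are not from any real library.

```python
def to_signed32(value):
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def slt(rs, rt):
    """Set on Less Than: compare as signed 32-bit integers."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sltu(rs, rt):
    """Set on Less Than Unsigned: compare the raw 32-bit patterns."""
    return 1 if (rs & 0xFFFFFFFF) < (rt & 0xFFFFFFFF) else 0


def subu(rs, rt):
    """Subtract rt from rs, wrapping instead of trapping."""
    return (rs - rt) & 0xFFFFFFFF


def nor(rs, rt):
    """Bit-wise NOR; with rt = 0 this is a plain bit-wise NOT of rs."""
    return ~(rs | rt) & 0xFFFFFFFF


if __name__ == "__main__":
    minus_one = 0xFFFFFFFF
    print(slt(minus_one, 1), sltu(minus_one, 1))   # same bits, different answers
    print(hex(nor(0x0F0F0F0F, 0)))
```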


Shift Operations 


Format and Description 


Shift Left Logical SLL rd, rt, shamt
Shift contents of register rt left by shamt bits, inserting zeroes into low order bits.
Place 32-bit result in register rd.

Shift Right Logical SRL rd, rt, shamt
Shift contents of register rt right by shamt bits, inserting zeroes into high order
bits. Place 32-bit result in register rd.

Shift Right Arithmetic SRA rd, rt, shamt
Shift contents of register rt right by shamt bits, sign-extending the high order bits.
Place 32-bit result in register rd.

Shift Left Logical Variable SLLV rd, rt, rs
Shift contents of register rt left. Low-order 5 bits of register rs specify number of
bits to shift. Insert zeroes into low order bits of rt and place 32-bit result in
register rd.

Shift Right Logical Variable SRLV rd, rt, rs
Shift contents of register rt right. Low-order 5 bits of register rs specify number of
bits to shift. Insert zeroes into high order bits of rt and place 32-bit result in
register rd.

Shift Right Arithmetic Variable SRAV rd, rt, rs
Shift contents of register rt right. Low-order 5 bits of register rs specify number of
bits to shift. Sign-extend the high order bits of rt and place 32-bit result in register
rd.
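
The shifts can be sketched in a few lines. The only subtle parts are that an arithmetic right shift copies the sign
bit, and that the variable forms look at just the low five bits of rs. Function names mirror the mnemonics and are
otherwise arbitrary.

```python
def sll(rt, shamt):
    """Shift left, filling low bits with zeroes."""
    return (rt << (shamt & 0x1F)) & 0xFFFFFFFF


def srl(rt, shamt):
    """Logical shift right, filling high bits with zeroes."""
    return (rt & 0xFFFFFFFF) >> (shamt & 0x1F)


def sra(rt, shamt):
    """Arithmetic shift right, copying the sign bit into the high bits."""
    rt &= 0xFFFFFFFF
    if rt & 0x80000000:
        rt -= 0x100000000            # reinterpret as a negative number
    return (rt >> (shamt & 0x1F)) & 0xFFFFFFFF


def sllv(rt, rs):
    """Variable forms take the shift amount from the low 5 bits of rs."""
    return sll(rt, rs & 0x1F)


def srlv(rt, rs):
    return srl(rt, rs & 0x1F)


def srav(rt, rs):
    return sra(rt, rs & 0x1F)


if __name__ == "__main__":
    print(hex(srl(0x80000000, 4)), hex(sra(0x80000000, 4)))
    print(sllv(1, 33))               # 33 & 0x1F == 1, so this shifts by one bit
```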





Multiply and Divide Operations 


Multiply MULT rs, rt 
Multiply contents of registers rs and rt as twos complement values. Place 64-bit 
result in special registers HI/LO. 

Multiply Unsigned MULTU rs, rt 
Multiply contents of registers rs and rt as unsigned values. Place 64-bit result in 
special registers HI/LO. 

Divide DIV rs, rt 
Divide contents of register rs by rt treating operands as twos complement 
values. Place 32-bit quotient in special register LO, and 32-bit remainder in HI. 

Divide Unsigned DIVU rs, rt 
Divide contents of register rs by rt treating operands as unsigned values. Place 
32-bit quotient in special register LO, and 32-bit remainder in HI. 

Move From HI MFHI rd 
Move contents of special register HI to register rd. 

Move From LO MFLO rd 
Move contents of special register LO to register rd. 

Move To HI MTHI rd 
Move contents of register rd to special register HI. 

Move To LO MTLO rd 
Move contents of register rd to special register LO. 
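
As a rough sketch of how the HI/LO pair is filled, the C below models MULT, MULTU and DIVU. The HiLo struct and the function names are invented for this example. Division by zero is not handled here; the caller is assumed to pass a non-zero divisor.

```c
#include <stdint.h>

/* Hypothetical model of the HI/LO special register pair. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} HiLo;

/* MULT: signed 32x32 multiply, 64-bit result split across HI/LO. */
HiLo mult(uint32_t rs, uint32_t rt)
{
    int64_t product = (int64_t)(int32_t)rs * (int64_t)(int32_t)rt;
    HiLo r;

    r.hi = (uint32_t)((uint64_t)product >> 32);
    r.lo = (uint32_t)(uint64_t)product;
    return r;
}

/* MULTU: unsigned 32x32 multiply. */
HiLo multu(uint32_t rs, uint32_t rt)
{
    uint64_t product = (uint64_t)rs * (uint64_t)rt;
    HiLo r;

    r.hi = (uint32_t)(product >> 32);
    r.lo = (uint32_t)product;
    return r;
}

/* DIVU: quotient goes to LO, remainder to HI. rt must not be zero. */
HiLo divu(uint32_t rs, uint32_t rt)
{
    HiLo r;

    r.lo = rs / rt;
    r.hi = rs % rt;
    return r;
}
```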


Jump and Branch Instructions 





Jump Instructions 


Format and Description 
Jump J target 
Shift 26-bit target address left two bits, combine with high-order 4 bits of PC and 
jump to address with a one instruction delay. 


Jump and Link JAL target 
Shift 26-bit target address left two bits, combine with high-order 4 bits of PC and 
jump to address with a one instruction delay. Place address of instruction 
following delay slot in r31 (link register). 


Jump Register JR rs 
Jump to address contained in register rs with a one instruction delay. 


Jump and Link Register JALR rs, rd 
Jump to address contained in register rs with a one instruction delay. Place 
address of instruction following delay slot in rd. 


Branch Instructions 


Instruction Format and Description 


Branch Target: All Branch instruction target addresses are computed as follows: 
Add address of instruction in delay slot and the 16-bit offset (shifted left two bits 
and sign-extended to 32 bits). All branches occur with a delay of one instruction. 





Branch on Equal BEQ rs, rt, offset 
Branch to target address if register rs equal to rt. 

Branch on Not Equal BNE rs, rt, offset 
Branch to target address if register rs not equal to rt. 

Branch on Less Than or Equal Zero BLEZ rs, offset 
Branch to target address if register rs less than or equal to 0. 

Branch on Greater Than Zero BGTZ rs, offset 
Branch to target address if register rs greater than 0. 

Branch on Less Than Zero BLTZ rs, offset 
Branch to target address if register rs less than 0. 

Branch on Greater Than or Equal Zero BGEZ rs, offset 
Branch to target address if register rs greater than or equal to 0. 

Branch on Less Than Zero And Link BLTZAL rs, offset 
Place address of instruction following delay slot in register r31 (link register). 
Branch to target address if register rs less than 0. 

Branch on Greater Than or Equal Zero And Link BGEZAL rs, offset 
Place address of instruction following delay slot in register r31 (link register). 
Branch to target address if register rs is greater than or equal to 0. 
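
The target address rules for branches and jumps can be sketched like this. The function names are made up for the example, and it is assumed that the "PC" used by the jump is the address of the delay slot instruction.

```c
#include <stdint.h>

/* Branch target: delay slot address plus the sign-extended 16-bit
 * offset shifted left two bits. branch_pc is the address of the
 * branch instruction itself. */
uint32_t branch_target(uint32_t branch_pc, uint16_t offset)
{
    uint32_t delay_slot = branch_pc + 4u;
    uint32_t extended = offset;

    if (extended & 0x8000u)
        extended |= 0xFFFF0000u;        /* sign-extend to 32 bits */
    return delay_slot + (extended << 2);
}

/* Jump target: 26-bit field shifted left two bits, combined with the
 * high-order 4 bits of the PC (assumed: address of the delay slot). */
uint32_t jump_target(uint32_t delay_slot_pc, uint32_t target26)
{
    return (delay_slot_pc & 0xF0000000u) | ((target26 & 0x03FFFFFFu) << 2);
}
```

An offset of 0xffff (that is, -1) therefore branches back to the branch instruction itself.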


Special Instructions 


System Call SYSCALL 
Initiates system call trap, immediately transferring control to exception handler. 
More information on the PSX SYSCALL routines is covered later on. 


Breakpoint BREAK 
Initiates breakpoint trap, immediately transferring control to exception handler. 








Co-processor Instructions 


Load Word to Co-processor LWCz rt, offset (base) 
Sign-extend 16-bit offset and add to base to form address. Load contents of 
addressed word into co-processor register rt of co-processor unit z. 

Store Word from Co-processor SWCz rt, offset (base) 
Sign-extend 16-bit offset and add to base to form address. Store contents of 
co-processor register rt from co-processor unit z at addressed memory word. 

Move To Co-processor MTCz rt, rd 
Move contents of CPU register rt into co-processor register rd of co-processor 
unit z. 

Move from Co-processor MFCz rt, rd 
Move contents of co-processor register rd from co-processor unit z to CPU 
register rt. 

Move Control To Co-processor CTCz rt, rd 
Move contents of CPU register rt into co-processor control register rd of 
co-processor unit z. 

Move Control From Co-processor CFCz rt, rd 
Move contents of control register rd of co-processor unit z into CPU register rt. 

Co-processor Operation COPz cofun 
Co-processor z performs an operation. The state of the R3000A is not modified 
by a co-processor operation. 


System Control Co-processor (COP0) Instructions 


Instruction Format and Description 


Move To CP0 MTC0 rt, rd 
Store contents of CPU register rt into register rd of CP0. This follows the 
convention of store operations. 

Move From CP0 MFC0 rt, rd 
Load CPU register rt with contents of CP0 register rd. 

Read Indexed TLB Entry TLBR 
Load EntryHi and EntryLo registers with TLB entry pointed at by Index register. 

Write Indexed TLB Entry TLBWI 
Load TLB entry pointed at by Index register with contents of EntryHi and EntryLo 
registers. 

Write Random TLB Entry TLBWR 
Load TLB entry pointed at by Random register with contents of EntryHi and 
EntryLo registers. 

Probe TLB for Matching Entry TLBP 
Load Index register with address of TLB entry whose contents match 
EntryHi and EntryLo. If no TLB entry matches, set high-order bit of Index 
register. 

Restore From Exception RFE 
Restore previous interrupt mask and mode bits of status register into current 
status bits. Restore old status bits into previous status bits. 





R3000A OPCODE ENCODING 


The following shows the opcode encoding for the MIPS architecture. 








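OPCODE 
         Bits 28...26 
31...29  0        1        2        3        4        5        6        7 
0        SPECIAL  BCOND    J        JAL      BEQ      BNE      BLEZ     BGTZ 
1        ADDI     ADDIU    SLTI     SLTIU    ANDI     ORI      XORI     LUI 
2        COP0     COP1     COP2     COP3     -        -        -        - 
3        -        -        -        -        -        -        -        - 
4        LB       LH       LWL      LW       LBU      LHU      LWR      - 
5        SB       SH       SWL      SW       -        -        SWR      - 
6        LWC0     LWC1     LWC2     LWC3     -        -        -        - 
7        SWC0     SWC1     SWC2     SWC3     -        -        -        - 

SPECIAL 
         Bits 2...0 
5...3    0        1        2        3        4        5        6        7 
0        SLL      -        SRL      SRA      SLLV     -        SRLV     SRAV 
1        JR       JALR     -        -        SYSCALL  BREAK    -        - 
2        MFHI     MTHI     MFLO     MTLO     -        -        -        - 
3        MULT     MULTU    DIV      DIVU     -        -        -        - 
4        ADD      ADDU     SUB      SUBU     AND      OR       XOR      NOR 
5        -        -        SLT      SLTU     -        -        -        - 
6        -        -        -        -        -        -        -        - 
7        -        -        -        -        -        -        -        - 

BCOND 
         Bits 18...16 
20...19  0        1        2        3        4        5        6        7 
0        BLTZ     BGEZ     -        -        -        -        -        - 
1        -        -        -        -        -        -        -        - 
2        BLTZAL   BGEZAL   -        -        -        -        -        - 

COPz 
         Bits 23...21 
25...24  0        1        2        3        4        5        6        7 
0        MF       -        CF       -        MT       -        CT       - 
1        BC       -        -        -        -        -        -        - 


Co-Processor Specific Operations 


COP0 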





Memory 


Overview 

The PSX's memory consists of four 512k 60ns SRAM chips creating 2 megabytes of system memory. The 
RAM is arranged so that the addresses at 0x00xxxxxx, 0xA0xxxxxx, and 0x80xxxxxx all point to the same physical 
memory. The PSX has a special coprocessor called cop0 that handles almost every aspect of memory management. 
Let us first examine how the memory looks and then how it is managed. 


The PSX Memory Map 


0x0000_0000-0x0000_ffff   Kernel (64K) 
0x0001_0000-0x001f_ffff   User Memory (1.9 Meg) 

0x1f00_0000-0x1f00_ffff   Parallel Port (64K) 

0x1f80_0000-0x1f80_03ff   Scratch Pad (1024 bytes) 
0x1f80_1000-0x1f80_2fff   Hardware Registers (8K) 

0x8000_0000-0x801f_ffff   Kernel and User Memory Mirror (2 Meg), Cached 

0xa000_0000-0xa01f_ffff   Kernel and User Memory Mirror (2 Meg), Uncached 

0xbfc0_0000-0xbfc7_ffff   BIOS (512K) 


All blank areas represent the absence of memory. The mirrors are used mostly for caching and exception 
handling purposes. The Kernel is also mirrored in all three user memory spaces. 





Virtual Memory 

The PSX uses a memory architecture known as "Virtual Memory" to help with general system memory and 
cache management. In a nutshell, what the PSX does is mirror the two meg of addressable space into 3 segments at 
three different virtual addresses. The names of these segments are Kuseg, Kseg0, and Kseg1. 

Kuseg spans from 0x0000_0000 to 0x001f_ffff. This is what you might call "real" memory. This facilitates 
the kernel having direct access to user memory regions. 

Kseg0 begins at virtual address 0x8000_0000 and goes to 0x801f_ffff. This segment is always translated to 
a linear 2MB region of the physical address space starting at physical address 0. All references through this segment 
are cacheable. When the most significant three bits of the virtual address are "100", the virtual address resides in 
kseg0. The physical address is constructed by replacing these three bits of the virtual address with the value "000". 

Kseg1 is also a linear 2MB region, from 0xa000_0000 to 0xa01f_ffff, mapped to the same physical memory 
starting at physical address 0. When the most significant three bits of the virtual address are "101", the virtual 
address resides in kseg1. The physical address is constructed by replacing these three bits of the virtual address 
with the value "000". Unlike kseg0, references through kseg1 are not cacheable. 
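
The segment rules above boil down to looking at the top three address bits and, for kseg0 and kseg1, clearing them. A small C sketch follows; the enum and function names are invented for this example.

```c
#include <stdint.h>

typedef enum { SEG_KUSEG, SEG_KSEG0, SEG_KSEG1, SEG_OTHER } Segment;

/* Classify a virtual address by its three most significant bits. */
Segment segment_of(uint32_t vaddr)
{
    uint32_t top3 = vaddr >> 29;

    if ((top3 & 4u) == 0)
        return SEG_KUSEG;               /* 0xx */
    if (top3 == 4u)
        return SEG_KSEG0;               /* 100, cached */
    if (top3 == 5u)
        return SEG_KSEG1;               /* 101, uncached */
    return SEG_OTHER;
}

/* For kseg0/kseg1: replace the top three bits with "000". */
uint32_t kseg_to_physical(uint32_t vaddr)
{
    return vaddr & 0x1FFFFFFFu;
}
```

For example, 0x8001_0000 (kseg0) and 0xa001_0000 (kseg1) both land on physical address 0x0001_0000.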


Looking a little deeper into how virtual memory works, the following shows the anatomy of an R3000A 
virtual address. The most significant 20 bits of the 32-bit virtual address are called the virtual page number, or VPN. 
Only the three highest bits (segment number) are involved in the virtual to physical address translation. 


31 30 29                    12 11            0 
|            VPN              |    Offset    | 
             20                      12 

bits 31-29 
0xx kuseg 
100 kseg0 
101 kseg1 


The three most significant bits of the virtual address identify which virtual address segment the processor is 
currently referencing; these segments have associated with them the mapping algorithm to be employed, and whether 
virtual addresses in that segment may reside in the cache. Pages are mapped by substituting a 20-bit physical frame 
number (PFN) for the 20-bit virtual page number field of the virtual address. This substitution is performed through 
the use of the on-chip Translation Lookaside Buffer (TLB). The TLB is a fully associative memory that holds 64 
entries to provide a mapping of 64 4kB pages. When a virtual reference is made to kuseg, each TLB entry is probed 
to see if it maps the corresponding VPN. 


Virtual to physical memory translation 
The following table is a quick look at how virtual memory gets translated via the Translation 
Lookaside Buffer. This whole subsystem of memory management is handled by Cop0. 


[Figure: the current Process ID and the virtual address from the Program Counter are presented to the TLB, a 
64-entry (63 down to 0) Content Addressable Memory (CAM), which outputs the physical address.] 





Cop0, The System Control Coprocessor 
This unit is actually part of the R3000A. This particular cop0 has been modified from the original R3000A 
cop0 architecture with the addition of a few registers and functions. Cop0 contains 16 32-bit control registers that 
control the various aspects of memory management, system interrupt (exception) management, and breakpoints. 
Much of it is compatible with the normal R3000A cop0. The following is an overview of the Cop0 registers. 


Cop0 Registers 


Number  Mnemonic  Name                            Read/Write  Usage 
0       INDX      Index                           r/w         Index to an entry in the 64-entry TLB file 
1       RAND      Random                          r           Provides software with a "suggested" random TLB entry 
                                                              to be written with the correct translation 
2       TLBL      TBL low                         r/w         Provides the data path for operations which read, write, 
                                                              or probe the TLB file (first 32 bits) 
3       BPC       Breakpoint PC                   r/w         Sets the breakpoint address to break on execute 
4       CTXT      Context                         r           Duplicates information in the BADV register, but 
                                                              provides this information in a form that may be more 
                                                              useful for a software TLB exception handler 
5       BDA       Breakpoint data                 r/w         Sets the breakpoint address for load/store operations 
6       PIDMASK   PID Mask                        r/w         Process ID mask 
7       DCIC      Data/Counter interrupt control  r/w         Breakpoint control 
8       BADV      Bad Virtual Address             r           Saves the bad virtual address for any addressing 
                                                              exception 
9       BDAM      Break data mask                 r/w         Data fetch address is ANDed with this value and then 
                                                              compared to the value in BDA 
10      TLBH      TBL high                        r/w         Provides the data path for operations which read, write, 
                                                              or probe the TLB file (second 32 bits) 
11      BPCM      Break point counter mask        r/w         Program counter is ANDed with this value and then 
                                                              compared to the value in BPC 
12      SR        System status register          r/w         Contains all the major status bits 
13      CAUSE     Cause                           r           Describes the most recently recognized exception 
14      EPC       Exception Program Counter       r           Contains the return address after an exception 
15      PRID      Processor ID                    r           Cop0 type and revision level 


Note that some of these registers will be explained later in the part on exception handling. But for now we 
will return to how the Cop0 is used in memory management. 





Returning to the TLB 

As stated before, the TLB is a fully associative memory that holds 64 entries to provide a mapping of 64 
4kB pages. Each TLB entry is 64 bits wide. The TLB is accessed through the Index, Random, TBL high, and TBL low 
registers. It is used for virtual to physical address mapping. 


The Index Register 

The Index register is a 32-bit, read-write register, which has a 6-bit field used to index to a specific entry in 
the 64-entry TLB file. The high-order bit of the register is a status bit which reflects the success or failure of a TLB 
Probe (tlbp) instruction. The Index register also specifies the TLB entry that will be affected by the TLB Read (tlbr) 
and TLB Write Index (tlbwi) instructions. The following shows the format of the Index register. 


31  30           14 13        8 7         0 
| P |      0      |   Index   |     0     | 
  1       17            6           8 


P Probe failure. Set to 1 when the last TLBProbe (tlbp) instruction was unsuccessful. 
Index Index to the TLB entry that will be affected by the TLBRead and TLBWrite instructions. 
0 Reserved. Must be written as zero, returns zero when read. 


The Random Register 

The Random register is a 32-bit read-only register. The format of the Random register is below. The six-bit 
Random field indexes a Random entry in the TLB. It is basically a counter which decrements on every clock cycle, 
but which is constrained to count in the range of 63 to 8. That is, software is guaranteed that the Random register will 
never index into the first 8 TLB entries. These entries can be "locked" by software into the TLB file, guaranteeing 
that no TLB miss exceptions will occur in operations which use those virtual address. This is useful for particularly 
critical areas of the operating system. 


31               14 13        8 7         0 
|        0        |  Random   |     0     | 
        18              6           8 


Random A random index (with a value from 8 to 63) to a TLB entry. 
0 Reserved. Returns zero when read. 


The Random register is typically used in the processing of a TLB miss exception. The Random register 
provides software with a "suggested" TLB entry to be written with the correct translation; although slightly less 
efficient than a Least Recently Used (LRU) algorithm, Random replacement offers substantially similar performance 
while allowing dramatically simpler hardware and software management. To perform a TLB replacement, the TLB 
Write Random (tlbwr) instruction is used to write the TLB entry indexed by this register. At reset, this counter is 
preset to the value ‘63’. Thus, it is possible for two processors to operate in "lock-step", even when using the 
Random TLB replacement algorithm. Also, software may directly read this register, although this feature probably 
has little utility outside of device testing and diagnostics. 


TBL High and TBL Low Registers 
These two registers provide the data path for operations which read, write, or probe the TLB file. The 
format of these registers is the same as the format of a TLB entry. 


           TBL High                               TBL Low 
|    VPN    |  PID  |   0   |     |    PFN    | N | D | V | G |   0   | 
     20        6       6               20       1   1   1   1     8 


VPN Virtual Page Number. Bits 31..12 of virtual address. 

PID Process ID field. A 6-bit field which lets multiple processes share the TLB while each process has a distinct 
mapping of otherwise identical virtual page numbers. 

PFN Page Frame Number. Bits 31..12 of the physical address. 


N Non-cacheable. If this bit is set, the page is marked as non-cacheable 

D Dirty. If this bit is set, the page is marked as "dirty" and therefore writable. This bit is actually a "write- 
protect" bit that software can use to prevent alteration of data 

V Valid. If this bit is set, it indicates that the TLB entry is valid; otherwise, a TLBL or TLBS Miss occurs. 
G Global. If this bit is set, the R3000A ignores the PID match requirement for valid translation. In kseg2, the 
Global bit lets the kernel access all mapped data without requiring it to save or restore PID (Process ID) values. 

0 Reserved. Must be written as '0', returns '0' when read. 
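
To make the field layout concrete, here is a C sketch that pulls the fields out of a 64-bit TLB entry and checks whether it maps a given address. The bit positions are worked out from the field widths above (20/6/6 for the high word, 20/1/1/1/1/8 for the low word); the struct and function names are made up for this example.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical view of one TLB entry: hi = TBL High, lo = TBL Low. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} TlbEntry;

uint32_t tlb_vpn(TlbEntry e) { return e.hi >> 12; }            /* bits 31..12 */
uint32_t tlb_pid(TlbEntry e) { return (e.hi >> 6) & 0x3Fu; }   /* bits 11..6  */
uint32_t tlb_pfn(TlbEntry e) { return e.lo >> 12; }            /* bits 31..12 */
bool tlb_n(TlbEntry e) { return ((e.lo >> 11) & 1u) != 0; }    /* non-cacheable */
bool tlb_d(TlbEntry e) { return ((e.lo >> 10) & 1u) != 0; }    /* dirty/writable */
bool tlb_v(TlbEntry e) { return ((e.lo >> 9) & 1u) != 0; }     /* valid */
bool tlb_g(TlbEntry e) { return ((e.lo >> 8) & 1u) != 0; }     /* global */

/* An entry maps vaddr if the VPN matches and either the Global bit
 * is set or the PID matches the current process. */
bool tlb_matches(TlbEntry e, uint32_t vaddr, uint32_t current_pid)
{
    return tlb_vpn(e) == (vaddr >> 12)
        && (tlb_g(e) || tlb_pid(e) == current_pid);
}

/* Substitute the PFN for the VPN, keeping the 12-bit page offset. */
uint32_t tlb_translate(TlbEntry e, uint32_t vaddr)
{
    return (tlb_pfn(e) << 12) | (vaddr & 0xFFFu);
}
```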

Exception Handling 


There are times when it is necessary to suspend a program in order to process a hardware or software 
function. The exception processing capability of the R3000A is provided to assure an orderly transfer of control from 
an executing program to the kernel. Exceptions may be broadly divided into two categories: they can be caused by an 
instruction or instruction sequence, including an unusual condition arising during its execution; or they can be caused 
by external events such as interrupts. When an R3000A detects an exception, the normal sequence of instruction flow is 
suspended; the processor is forced to kernel mode where it can respond to the abnormal or asynchronous event. The 
table below lists the exceptions recognized by the R3000A. 


Exception              Mnemonic            Cause 

Reset                  Reset               Assertion of the Reset signal causes an exception 
                                           that transfers control to the special vector at virtual 
                                           address 0xbfc0_0000 (the start of the BIOS). 

Bus Error              IBE (Instruction)   Assertion of the Bus Error input during a read 
                       DBE (Data)          operation, due to such external events as bus 
                                           timeout, backplane memory errors, invalid physical 
                                           address, or invalid access types. 

Address Error          AdEL (Load)         Attempt to load, fetch, or store an unaligned word; 
                       AdES (Store)        that is, a word or halfword at an address not evenly 
                                           divisible by four or two, respectively. Also caused 
                                           by reference to a virtual address with most 
                                           significant bit set while in User Mode. 

Overflow               Ovf                 Twos complement overflow during add or subtract. 

System Call            Sys                 Execution of the SYSCALL Trap Instruction. 

Breakpoint             Bp                  Execution of the break instruction. 

Reserved Instruction   RI                  Execution of an instruction with an undefined or 
                                           reserved major operation code (bits 31:26), or a 
                                           special instruction whose minor opcode (bits 5:0) is 
                                           undefined. 

Co-processor Unusable  CpU                 Execution of a co-processor instruction when the 
                                           CU (Co-processor usable) bit is not set for the 
                                           target co-processor. 

TLB Miss               TLBL (Load)         A referenced TLB entry's Valid bit isn't set. 
                       TLBS (Store) 

TLB Modified           Mod                 During a store instruction, the Valid bit is set but 
                                           the dirty bit is not set in a matching TLB entry. 

Interrupt              Int                 Assertion of one of the six hardware interrupt 
                                           inputs or setting of one of the two software 
                                           interrupt bits in the Cause register. 


Returning to the Cop0 

The Cop0 controls the exception handling with the use of the Cause register, the EPC register, the Status 
register, the BADV register, and the Context register. A brief description of each follows, after which the rest of the 
Cop0 registers for breakpoint management will be described for the sake of completeness. 





The Cause Register 

The contents of the Cause register describe the last exception. A 5-bit exception code indicates the cause of the 
current exception; the remaining fields contain detailed information specific to certain exceptions. All bits in this 
register, with the exception of the SW bits, are read-only. 





31                                                                    0 
| BD | 0 | CE |       0       |    IP    | SW | 0 | EXECODE | 0 | 
  1    1    2         12           6       2    1      5      2 


BD Branch Delay. The Branch Delay bit is set (1) if the last exception was taken while the processor was 
executing in the branch delay slot. If so, then the EPC will be rolled back to point to the branch 
instruction, so that it can be re-executed and the branch direction re-determined. 

CE Coprocessor Error. Contains the coprocessor number if the exception occurred because of a 
coprocessor instruction for a coprocessor which wasn't enabled in SR. 

IP Interrupts Pending. It indicates which interrupts are pending. Regardless of which interrupts are 
masked, the IP field can be used to determine which interrupts are pending. 

SW Software Interrupts. The SW bits can be written to set or reset software interrupts. As long as any 
of the bits are set within the SW field they will cause an interrupt if the corresponding bit is set in SR under the 
interrupt mask field. 

0 Reserved. Must be written as 0. Returns 0 when read. 

EXECODE Exception Code Field. Describes the type of exception that occurred. The following table lists the 
exception codes. 


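Number  Mnemonic  Description 
0       INT       External Interrupt 
1       MOD       TLB Modification Exception 
2       TLBL      TLB miss Exception (Load or instruction fetch) 
3       TLBS      TLB miss Exception (Store) 
4       ADEL      Address Error Exception (Load or instruction fetch) 
5       ADES      Address Error Exception (Store) 
6       IBE       Bus Error Exception (for Instruction Fetch) 
7       DBE       Bus Error Exception (for data Load or Store) 
8       SYS       SYSCALL Exception 
9       BP        Breakpoint Exception 
10      RI        Reserved Instruction Exception 
11      CPU       Co-Processor Unusable Exception 
12      OVF       Arithmetic Overflow Exception 
13-31   -         Reserved 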
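
An exception handler usually starts by picking the Cause register apart. The C sketch below does this using bit positions worked out from the field widths above (1/1/2/12/6/2/1/5/2 from bit 31 down). The function names are invented for this example, and the mnemonic strings follow the standard R3000 exception code numbering.

```c
#include <stdint.h>
#include <stdbool.h>

bool     cause_bd(uint32_t cause)      { return (cause >> 31) != 0; }      /* bit 31      */
unsigned cause_ce(uint32_t cause)      { return (cause >> 28) & 0x3u; }    /* bits 29..28 */
unsigned cause_ip(uint32_t cause)      { return (cause >> 10) & 0x3Fu; }   /* bits 15..10 */
unsigned cause_sw(uint32_t cause)      { return (cause >> 8) & 0x3u; }     /* bits 9..8   */
unsigned cause_execode(uint32_t cause) { return (cause >> 2) & 0x1Fu; }    /* bits 6..2   */

/* Map an exception code to its mnemonic; codes 13-31 are reserved. */
const char *execode_name(unsigned code)
{
    static const char *const names[13] = {
        "INT", "MOD", "TLBL", "TLBS", "ADEL", "ADES", "IBE",
        "DBE", "SYS", "BP", "RI", "CPU", "OVF"
    };

    return code < 13u ? names[code] : "Reserved";
}
```

A Cause value of 0x8000_0020, for instance, decodes as a SYSCALL exception taken in a branch delay slot.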





The EPC (Exception Program Counter) Register 

The 32-bit EPC register contains the virtual address of the instruction which took the exception, from which 
point processing resumes after the exception has been serviced. When the virtual address of the instruction resides in 
a branch delay slot, the EPC contains the virtual address of the instruction immediately preceding the exception (that 
is, the EPC points to the Branch or Jump instruction). 


BADV Register 
The BADV register saves the entire bad virtual address for any addressing exception. 


Context Register 

The Context register duplicates some of the information in the BADV register, but provides this information 
in a form that may be more useful for a software TLB exception handler. The following illustrates the layout of the 
Context register. The Context register is used to allow software to quickly determine the main memory address of the 
page table entry corresponding to the bad virtual address, and allows the TLB to be updated by software very quickly 
(using a nine-instruction code sequence). 


31          21 20                2 1   0 
|  PTE Base  |       BADV        | 0 | 
      11               19          2 


0 Reserved, read as 0 and must be written as 0. 
BADV Failing virtual page number (set by hardware, read only, derived from BADV register). 
PTE Base Base address of page table entry, set by the kernel. 


The Status Register 

The Status register contains all the major status bits; any exception puts the system in Kernel mode. All bits 
in the status register, with the exception of the TS (TLB Shutdown) bit, are readable and writable; the TS bit is 
read-only. The layout below shows the various bits in the status register. The status register contains a three 
level stack (current, previous, and old) of the kernel/user mode bit (KU) and the interrupt enable (IE) bit. The stack 
is pushed when each exception is taken, and popped by the Restore From Exception instruction. These bits may also 
be directly read or written. At reset, the SwC, KUc, and IEc bits are set to zero; BEV is set to one; and the value of 
the TS bit is set to 0 (TS = 0). The rest of the bit fields are undefined after reset. 


31                                                                                                         0 
| CU | 0 | RE | 0 | BEV | TS | PE | CM | PZ | SwC | IsC | IntMask | 0 | KUo | IEo | KUp | IEp | KUc | IEc | 
  4    2   1    2    1     1    1    1    1     1     1      8      2    1     1     1     1     1     1 





The various bits of the status register are defined as follows: 


CU Co-processor Usability. These bits individually control user level access to co-processor operations, 
including the polling of the BrCond input port and the manipulation of the System Control Co-processor (CP0). CU2 
is for the GTE, CU1 is for the FPA, which is not available in the PSX. 


RE Reverse Endianness. The R3000A allows the system to determine the byte ordering convention for the 
Kernel mode, and the default setting for user mode, at reset time. If this bit is cleared, the endianness 
defined at reset is used for the current user task. If this bit is set, then the user task will operate with the opposite byte 
ordering convention from that determined at reset. This bit has no effect on kernel mode. 


BEV Bootstrap Exception Vector. The value of this bit determines the locations of the exception vectors of the 
processor. If BEV = 1, then the processor is in "Bootstrap" mode, and the exception vectors reside 
in the BIOS ROM. If BEV = 0, then the processor is in normal mode, and the exception vectors reside in RAM. 


TS TLB Shutdown. This bit reflects whether the TLB is functioning. 


PE Parity Error. This field should be written with a "1" at boot time. Once initialized, this field will always be 
read as "0". 


CM Cache Miss. This bit is set if a cache miss occurred while the cache was isolated. It is useful in determining 
the size and operation of the internal cache subsystem. 


PZ Parity Zero. This field should always be written with a "0". 


SwC Swap Caches. Setting this bit causes the execution core to use the on-chip instruction cache as a data cache 
and vice-versa. Resetting the bit to zero unswaps the caches. This is useful for certain operations 
such as instruction cache flushing. This feature is not intended for normal operation with the caches swapped. 


IsC Isolate Cache. If this bit is set, the data cache is "isolated" from main memory; that is, store operations 
modify the data cache but do not cause a main memory write to occur, and load operations return the data value from 
the cache whether or not a cache hit occurred. This bit is also useful in various operations such as flushing. 


IM Interrupt Mask. This 8-bit field can be used to mask the hardware and software interrupts to the execution 
engine (that is, not allow them to cause an exception). IM(1:0) are used to mask the software interrupts, and IM(7:2) 
mask the 6 external interrupts. A value of "0" disables a particular interrupt, and a "1" enables it. Note that the IE bit 
is a global interrupt enable; that is, if the IE is used to disable interrupts, the value of particular mask bits is 
irrelevant; if IE enables interrupts, then a particular interrupt is selectively masked by this field. 


KUo Kernel/User old. This is the privilege state two exceptions previously. A "0" indicates kernel mode. 


IEo Interrupt Enable old. This is the global interrupt enable state two exceptions previously. A "1" indicates that 
interrupts were enabled, subject to the IM mask. 


KUp Kernel/User previous. This is the privilege state prior to the current exception. A "0" indicates kernel mode. 


IEp Interrupt Enable previous. This is the global interrupt enable state prior to the current exception. A "1" 
indicates that interrupts were enabled, subject to the IM mask. 


KUc Kernel/User current. This is the current privilege state. A "0" indicates kernel mode. 


IEc Interrupt Enable current. This is the current global interrupt enable state. A "1" indicates that interrupts are 
enabled, subject to the IM mask. 


0 Fields indicated as "0" are reserved; they must be written as "0", and will return "0" when read. 
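
The three level KU/IE stack in the low six bits of the status register can be modeled with two small helpers. This is a sketch with made-up function names. It assumes that taking an exception clears both KUc and IEc (kernel mode, interrupts off), and that RFE leaves the old pair unchanged, as the RFE description earlier suggests.

```c
#include <stdint.h>

/* Exception entry: push the stack. current -> previous, previous -> old,
 * and the new current pair becomes 00 (kernel mode, interrupts disabled). */
uint32_t sr_enter_exception(uint32_t sr)
{
    uint32_t stack = (sr & 0x3Fu) << 2;

    return (sr & ~0x3Fu) | (stack & 0x3Fu);
}

/* RFE: pop the stack. previous -> current, old -> previous,
 * old stays as it was. */
uint32_t sr_rfe(uint32_t sr)
{
    uint32_t stack = sr & 0x3Fu;

    stack = (stack & 0x30u) | (stack >> 2);
    return (sr & ~0x3Fu) | stack;
}
```

Starting from user mode with interrupts enabled (low bits 000011), an exception gives 001100, and RFE brings the current pair back to 11.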


PRID Register 
This register is useful to software in determining which revision of the processor is executing the code. The format of 
this register is illustrated below. 


31               16 15        8 7         0 
|        0        |    Imp    |    Rev    | 
        16              8           8 


Imp 3 Cop0 type R3000A 
    7 IDT unique (3041) use REV to determine correct configuration. 
Rev Revision level. 


EXCEPTION VECTOR LOCATIONS 

The R3000A separates exceptions into three vector spaces. The value of each vector depends on the BEV 
(Boot Exception Vector) bit of the status register, which allows two alternate sets of vectors (and thus two different 
pieces of code) to be used. Typically, this is used to allow diagnostic tests to occur before the functionality of the 
cache is validated; processor reset forces the value of the BEV bit to a 1. 


Exception   Virtual Address   Physical Address 
Reset       0xbfc0_0000       0x1fc0_0000 
UTLB Miss   0x8000_0000       0x0000_0000 
General     0x8000_0080       0x0000_0080 


Exception Vectors When BEV = 0 


Exception   Virtual Address   Physical Address 
Reset       0xbfc0_0000       0x1fc0_0000 
UTLB Miss   0xbfc0_0100       0x1fc0_0100 
General     0xbfc0_0180       0x1fc0_0180 


Exception Vectors When BEV = 1 
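
The two vector tables can be folded into one lookup, shown here as a C sketch. The enum and function names are made up for this example; the third vector is taken to be the general exception vector.

```c
#include <stdint.h>

typedef enum { VEC_RESET, VEC_UTLB_MISS, VEC_GENERAL } VectorKind;

/* Return the virtual address of an exception vector for a given BEV bit. */
uint32_t exception_vector(VectorKind kind, int bev)
{
    if (kind == VEC_RESET)
        return 0xBFC00000u;                          /* same in both modes */
    if (bev)
        return kind == VEC_UTLB_MISS ? 0xBFC00100u : 0xBFC00180u;
    return kind == VEC_UTLB_MISS ? 0x80000000u : 0x80000080u;
}
```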





Exception Priority 
The following is a priority list of exceptions: 


Reset   At any time (highest) 
AdEL    Memory (Load instruction) 
AdES    Memory (Store instruction) 
DBE     Memory (Load or store) 
Mod     ALU (Data TLB) 
TLBL    ALU (DTLB Miss) 
TLBS    ALU (DTLB Miss) 
Ovf     ALU 
Int     ALU 
Sys     RD (Instruction Decode) 
Bp      RD (Instruction Decode) 
RI      RD (Instruction Decode) 
CpU     RD (Instruction Decode) 
TLBL    I-Fetch (ITLB Miss) 
AdEL    IVA (Instruction Virtual Address) 
IBE     RD (end of I-Fetch, lowest) 


Breakpoint Management 
The following is a listing of the registers in Cop0 that are used for breakpoint management. These registers 
are very useful for low-level debugging. 


BPC 
Breakpoint on execute. Sets the breakpoint address to break on execute. 


BDA 
Breakpoint on data access. Sets the breakpoint address for load/store operations. 


DCIC 
Breakpoint control. To use the Execution breakpoint, set PC. To use the Data access breakpoint, set DA and 
either R, W or both. Both breakpoints can be used simultaneously. When a breakpoint occurs the PSX jumps to 
0x0000_0040. 


31                                                 0 
| 1 | 1 | 1 | 0 | W | R | DA | PC | 1 |     0      | 
  1   1   1   1   1   1   1    1    1       23 


W  0 
   1 Break on Write 
R  0 
   1 Break on Read 
DA 0 Data access breakpoint disabled 
   1 Data access breakpoint enabled 
PC 0 Execution breakpoint disabled 
   1 Execution breakpoint enabled 


BDAM 
Data Access breakpoint mask. Data fetch address is ANDed with this value and then compared 
to the value in BDA. 


BPCM 
Execute breakpoint mask. Program counter is ANDed with this value and then compared to 
the value in BPC. 
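
The mask comparison described for BDAM and BPCM amounts to one AND and one compare. A C sketch with invented function names:

```c
#include <stdint.h>
#include <stdbool.h>

/* Execution breakpoint: PC is ANDed with BPCM, then compared to BPC. */
bool exec_breakpoint_hit(uint32_t pc, uint32_t bpc, uint32_t bpcm)
{
    return (pc & bpcm) == bpc;
}

/* Data breakpoint: data address is ANDed with BDAM, then compared to BDA. */
bool data_breakpoint_hit(uint32_t addr, uint32_t bda, uint32_t bdam)
{
    return (addr & bdam) == bda;
}
```

With a mask of 0xffff_0000 and BPC set to 0x8001_0000, any instruction in the 64K block starting at 0x8001_0000 matches. A mask of 0xffff_ffff matches a single address.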


DMA 


From time to time the PSX will need to take the CPU off the main bus in order to give a device direct access 
to memory. The devices able to take control of the bus are the CD-ROM, MDEC, GPU, SPU, and the 
Parallel port. There are 7 DMA channels in all (the GPU and MDEC use two each). The DMA registers reside between 
0x1f80_1080 and 0x1f80_10f4. The DMA channel registers are located starting at 0x1f80_1080. The base address 
for each channel is as follows: 


Base Address   Channel          Device 
0x1f80_1080    DMA channel 0    MDECin 
0x1f80_1090    DMA channel 1    MDECout 
0x1f80_10a0    DMA channel 2    GPU (lists + image data) 
0x1f80_10b0    DMA channel 3    CD-ROM 
0x1f80_10c0    DMA channel 4    SPU 
0x1f80_10d0    DMA channel 5    PIO 
0x1f80_10e0    DMA channel 6    GPU OTC (reverse clear the Ordering Table) 


Each channel has three 32-bit control registers at an offset from the base address for that particular channel. 
These registers are the DMA Memory Address Register (D_MADR) at the base address, the DMA Block Control 
Register (D_BCR) at base+4, and the DMA Channel Control Register (D_CHCR) at base+8. 

In order to use DMA the appropriate channel must be enabled. This is done using the DMA Primary 
Control Register (DPCR) located at 0x1f80_10f0. 


DMA Primary Control Register (DPCR) 0x1f80_10f0 


31                                                            0 
|      | DMA6 | DMA5 | DMA4 | DMA3 | DMA2 | DMA1 | DMA0 | 
   4      4      4      4      4      4      4      4 


Each channel has a 4 bit control block allocated in this register. 
Bit 3  1 = DMA Enabled 
    2  Unknown 
    1  Unknown 
    0  Unknown 


Bit 3 must be set for a channel to operate. 
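
Since each channel owns one nibble of the DPCR, enabling a channel means setting bit 3 of that nibble. The C sketch below works on plain values rather than touching hardware. The function names are made up, and the base address helper assumes the channel registers are spaced 0x10 apart starting at 0x1f80_1080.

```c
#include <stdint.h>

/* Set the enable bit (bit 3 of the channel's 4 bit block) in a DPCR value. */
uint32_t dpcr_enable_channel(uint32_t dpcr, unsigned channel)
{
    return dpcr | (0x8u << (4u * channel));
}

/* Test whether a channel's enable bit is set. */
int dpcr_channel_enabled(uint32_t dpcr, unsigned channel)
{
    return ((dpcr >> (4u * channel)) & 0x8u) != 0;
}

/* Base address of the register block for DMA channel 0..6. */
uint32_t dma_channel_base(unsigned channel)
{
    return 0x1F801080u + 0x10u * channel;
}
```

Enabling channel 2 (GPU) on an otherwise empty DPCR gives the value 0x0000_0800.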


As stated above, each device has three 32-bit control registers within its own DMA address space. The 
following describes their functions. The n represents 8,9,a,b,c,d,e for DMA channels 0,1,2,3,4,5,6 respectively. 


DMA Memory Address Register (D_MADR) 0x1f80_10n0 


31                                  0 
|               MADR                | 
                 32 


MADR Pointer to the virtual address the DMA will start reading from/writing to. 


DMA Block Control Register (D_BCR) 0x1f80_10n4 


31            16 15             0 
|      BA      |       BS       | 
       16              16 


BA Amount of blocks 
BS Blocksize (words) 


The channel will transfer BA blocks of BS words. Take care not to set the size larger than the buffer of the 
corresponding unit can hold. (GPU & SPU both have a $10 (16) word buffer.) A larger blocksize means a faster transfer. 


DMA Channel Control Register (D_CHCR) 0x1f80_10n8 
31                                                    0 
|   0   | TR |      0      | LI | CO |    0    | DR | 
    7     1        13        1    1       8      1 
TR 0 No DMA transfer busy. 
   1 Start DMA transfer/DMA transfer busy. 
LI 1 Transfer linked list. (GPU only) 
CO 1 Transfer continuous stream of data. 
DR 1 Direction from memory 
   0 Direction to memory 
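
Putting the block control and channel control layouts together, the values written to D_BCR and D_CHCR can be built as below. This is a sketch only: the macro and function names are invented, the bit positions are derived from the field widths shown above (TR at bit 24, LI at bit 10, CO at bit 9, DR at bit 0), and nothing here touches real hardware.

```c
#include <stdint.h>

#define CHCR_TR (1u << 24)   /* start transfer / transfer busy */
#define CHCR_LI (1u << 10)   /* transfer linked list (GPU only) */
#define CHCR_CO (1u << 9)    /* transfer continuous stream */
#define CHCR_DR (1u << 0)    /* 1 = from memory, 0 = to memory */

/* D_BCR value: BA blocks of BS words each. */
uint32_t dma_bcr(uint16_t block_count, uint16_t block_size_words)
{
    return ((uint32_t)block_count << 16) | block_size_words;
}

/* D_CHCR value that starts a linked list transfer from memory. */
uint32_t dma_chcr_linked_list(void)
{
    return CHCR_TR | CHCR_LI | CHCR_DR;
}

/* The TR bit reads back as 1 while the transfer is still running. */
int dma_busy(uint32_t chcr)
{
    return (chcr & CHCR_TR) != 0;
}
```

For example, 4 blocks of $10 words give a D_BCR value of 0x0004_0010.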


The last register is used to control DMA interrupts. The usage is currently unknown. 


DMA Interrupt Control Register (DICR) 0x1f80 10f4 


Video 


Overview 

The GPU is the unit responsible for the graphical output of the PSX. It handles display and drawing of all 
graphics. It has control over a 1MB frame buffer, which at 16 bits per pixel gives you a maximum "surface" of 
1024x512 resolution. It also contains a 2Kb texture cache for increased speed. The display can be set for 15-bit 
color or 24-bit color. 

Because the PSX totally lacks an FPU, a second coprocessor has been added called the Geometry 
Transformation Engine or GTE. The GTE is the heart of all 3d calculations on the PSX. The GTE can perform 
vector and matrix operations, perspective transformation, color equations and the like. It is much faster than the CPU 
on these operations. It is mounted as the second coprocessor (Cop2) and as such takes up no physical address space 
in the PSX. The GTE is covered later in the document. 


The Graphics Processing Unit (GPU) 


As stated before, the GPU is responsible for graphical output. It has at its disposal a 1 MB frame buffer and 
registers to access it. The frame buffer is totally inaccessible to the CPU, meaning that it doesn't reside in 
addressable memory. The only way to access it is through the GPU. The GPU is able to take "commands" from the 
CPU, or via DMA, to place objects on the frame buffer to be displayed. Communication is handled through a 
command and data port. It has a 64 byte command FIFO buffer, which can hold up to 3 commands and is connected 
to a DMA channel for transfer of image data and linked command lists (channel 2) and a DMA channel for reverse 
clearing an Ordering Table (channel 6). 


Communication and Ordering Tables (OT).

All data regarding drawing and the drawing environment is sent as packets to the GPU. Each packet tells the
GPU how and where to draw one primitive, or it sets one of the drawing environment parameters. The display
environment is set up through single word commands using the control port of the GPU.

Packets can be forwarded word by word through the data port of the GPU, or more efficiently for large
numbers of packets through DMA. A special DMA mode was created for this so large numbers of packets can be
sent and managed easily. In this mode a list of packets is sent, where each entry in the list consists of a one word
header, containing the address of the next entry and the size of the packet, followed by the packet itself. A result of this is
that the packets do not need to be stored sequentially. This makes it possible to easily control the order in which
packets get processed. The GPU processes the packets it gets in the order they are offered, so the first entry in the
list also gets drawn first. To insert a packet into the middle of the list, find the packet after which it needs to be
processed, copy the next-entry address from that packet into the new packet, and replace it with the address of the
new packet.

To aid in finding a location in the list, the Ordering Table was invented. At first this is basically a linked list
with entries of packet size 0, so it's a list of only list entry headers, where each entry points to the next entry. Then,
as primitives are generated by your program, you can add them to the table at a certain index. Just read the
address in the table entry, replace it with the address of the new packet, and store the address from the table in the
packet. When all packets are generated, drawing just requires passing the address of the first list entry to the DMA,
and the packets will get drawn in the order you entered them into the table. Packets entered at a higher table
index will get drawn after those entered at a lower table index. Packets entered at the same index will get drawn in
the reverse of the order they were entered, the last one first.

In 3d drawing it's most common that you want the primitives with the highest Z value to be drawn first, so it 
would be nice if the table would be drawn the other way around, so the Z value can be used as index. This is a simple 
thing, just make a table of which each entry points to the previous entry, and start the DMA with the address of the 
last table entry. To assist you in making such a table, a special DMA channel is available which creates it for you. 
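The table handling described above can be sketched in C. This is only a host-side model under stated assumptions, not library code: the function names are made up for illustration, "addresses" are byte offsets into a plain array instead of real bus addresses, and the header format (size in the top byte, next address in the low 24 bits, 0xffffff as terminator) is taken from the linked list description later in this document.

```c
#include <assert.h>
#include <stdint.h>

#define OT_END 0x00ffffffu  /* terminator pointer */

/* Fill n table entries starting at byte offset ot_addr so that each empty
 * entry points to the previous one; entry 0 carries the terminator. */
static void ot_clear_reverse(uint32_t *mem, uint32_t ot_addr, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        mem[ot_addr / 4 + i] = (i == 0) ? OT_END : ot_addr + 4u * (i - 1);
}

/* Link the packet at byte offset pkt_addr (words = data words following
 * its header) into table entry idx. */
static void ot_add(uint32_t *mem, uint32_t ot_addr, unsigned idx,
                   uint32_t pkt_addr, unsigned words)
{
    uint32_t *entry = &mem[ot_addr / 4 + idx];
    assert(words < 256 && pkt_addr < OT_END);
    /* The packet inherits the address the table entry held... */
    mem[pkt_addr / 4] = ((uint32_t)words << 24) | (*entry & 0x00ffffffu);
    /* ...and the table entry now points at the packet. */
    *entry = (*entry & 0xff000000u) | pkt_addr;
}

/* Follow the list the way the DMA would, recording the offsets of the
 * entries that actually carry data. Returns how many were found. */
static unsigned ot_walk(const uint32_t *mem, uint32_t start,
                        uint32_t *out, unsigned max)
{
    unsigned n = 0;
    for (uint32_t a = start; a != OT_END; a = mem[a / 4] & 0x00ffffffu)
        if ((mem[a / 4] >> 24) != 0 && n < max)
            out[n++] = a;
    return n;
}
```

Starting the walk at the last table entry visits the highest index first, so a Z value can serve directly as the index; two packets added at the same index come out newest first.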


The Frame Buffer 

The frame buffer is the memory which stores all graphic data which the GPU can access and manipulate
while drawing and displaying an image. The memory is under the control of the GPU and cannot be accessed by the CPU directly.
It is operated solely by the GPU. The frame buffer has a size of 1 MB and is treated as a space 1024 pixels wide
and 512 pixels high. Each "pixel" has the size of one halfword (16 bits). It is not treated linearly like usual memory, but is
accessed through coordinates, with an upper left corner of (0,0) and a lower right corner of (1023,511).

When data is displayed from the frame buffer, a rectangular area is read from the specified coordinate 
within this memory. The size of this area can be chosen from several hardware defined types. Note that these 
hardware sizes are only valid when the X and Y stop/start registers are at their default values. This display area can 
be displayed in two color formats, being 15bit direct and 24bit direct. The data format of one pixel is as follows. 


15-bit direct display


| M |  Blue   |  Green  |   Red   |
 15  14    10  9      5  4       0


This means each color has a value of 0-31. The MSB of a pixel (M) is used to mask the pixel. 
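As a small illustration, the 15-bit layout can be packed and unpacked with a few shifts. This is a sketch only; the helper names are made up here and are not part of any official library.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 5-bit components into one frame buffer pixel: M | Blue | Green | Red. */
static uint16_t px15_pack(unsigned r, unsigned g, unsigned b, unsigned m)
{
    assert(r < 32 && g < 32 && b < 32 && m < 2);
    return (uint16_t)((m << 15) | (b << 10) | (g << 5) | r);
}

/* Extract the individual fields again. */
static unsigned px15_red(uint16_t p)   { return p & 0x1fu; }
static unsigned px15_green(uint16_t p) { return (p >> 5) & 0x1fu; }
static unsigned px15_blue(uint16_t p)  { return (p >> 10) & 0x1fu; }
static unsigned px15_mask(uint16_t p)  { return (unsigned)p >> 15; }
```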


24-bit direct display
The GPU can also be set to 24-bit mode, in which case 3 bytes form one pixel, 1 byte for each color. Data in
this mode is arranged as follows:


|   G0   |   R0   |   R1   |   B0   |   B1   |   G1   |
 15     8 7      0 15     8 7      0 15     8 7      0


Thus 2 display pixels are encoded in 3 frame buffer pixels. They are displayed as follows: [R0,G0,B0]
[R1,G1,B1].
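A hedged sketch of how two 24-bit display pixels could be pulled out of three frame buffer halfwords. It assumes the halfwords hold G0|R0, R1|B0 and B1|G1 (high byte | low byte), which is the same as a byte stream R0 G0 B0 R1 G1 B1 in little endian memory; the type and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb24;

/* Split three 16-bit frame buffer pixels into two 24-bit display pixels. */
static void px24_unpack(const uint16_t w[3], rgb24 out[2])
{
    assert(w != 0 && out != 0);
    out[0].r = (uint8_t)(w[0] & 0xff);   /* R0: low byte of halfword 0  */
    out[0].g = (uint8_t)(w[0] >> 8);     /* G0: high byte of halfword 0 */
    out[0].b = (uint8_t)(w[1] & 0xff);   /* B0: low byte of halfword 1  */
    out[1].r = (uint8_t)(w[1] >> 8);     /* R1: high byte of halfword 1 */
    out[1].g = (uint8_t)(w[2] & 0xff);   /* G1: low byte of halfword 2  */
    out[1].b = (uint8_t)(w[2] >> 8);     /* B1: high byte of halfword 2 */
}
```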


Primitives. 
A basic figure which the GPU can draw is called a primitive. It can draw the following:


• Polygon
The GPU can draw 3 point and 4 point polygons. Each point of the polygon specifies a point in the frame
buffer. The polygon can also be Gouraud shaded. The correct order of vertices for 4 point polygons is as follows:


1   2

3   4


A 4 point polygon is processed internally as two 3 point polygons. Also note that when drawing a polygon the
GPU will not draw the rightmost and bottom edges. So a (0,0)-(32,32) rectangle will actually be drawn as (0,0)-
(31,31). Make sure adjoining polygons have the same coordinates if you want them to touch each other!


• Polygon with texture

A primitive of this type is the same as above, except that a texture is applied. Each vertex of the polygon maps
to a point on a texture page in the frame buffer. The polygon can be Gouraud shaded.

Because a 4 point polygon is processed internally as two 3 point polygons, texture mapping is also done
independently for both halves. This has some annoying consequences.


• Rectangle
A rectangle is defined by the location of the top left corner and its width and height. Width and height can be either
free, 8*8 or 16*16. It's drawn much faster than a polygon, but Gouraud shading is not possible.


• Sprite

A sprite is a textured rectangle, defined as a rectangle with coordinates on a texture page. Like the rectangle, it is
drawn much faster than the polygon equivalent. No Gouraud shading is possible. Even though the primitive is called a
sprite, it has nothing in common with the traditional sprite, other than that it's a rectangular piece of graphics. Unlike
the PSX sprite, the traditional sprite is NOT drawn to the bitmap, but gets sent to the screen instead of the actual
graphics data at that location at display time.


• Line
A line is a straight line between 2 specified points. The line can be Gouraud shaded. A special form is the polyline, for
which an arbitrary number of points can be specified.


• Dot
The dot primitive draws one pixel at the specified coordinate and in the specified color. It is actually a special form
of rectangle, with a size of 1x1.


Textures 

A texture is an image put on a polygon or sprite. The data must be prepared beforehand in the frame
buffer. This image is called a texture pattern. The texture pattern is located on a texture page, which has a standard
size and is located somewhere in the frame buffer, see below. The data of a texture can be stored in 3 different
modes:


• 15-bit direct mode


| S |  Blue   |  Green  |   Red   |
 15  14    10  9      5  4       0


This means each color has a value of 0-31. The MSB of a pixel (S) is used to specify if the pixel is semi
transparent or not. More on that later.


• 8-bit CLUT mode
Each pixel is defined by 8 bits and the value of the pixel is converted to a 15-bit color using the CLUT (color
lookup table), much like standard VGA pictures. So in effect you have 256 colors which are in 15-bit precision.


|   I1   |   I0   |
 15     8 7      0


I0 is the index to the CLUT for the left pixel, I1 for the right.


• 4-bit CLUT mode
Same as above except that only 16 colors can be used. Data is arranged as follows:


| I3  | I2  | I1  | I0  |
 15 12 11  8 7   4 3   0


I0 is drawn first, to the left, through I3 to the right.


• Texture Pages

Texture pages have a unit size of 256*256 pixels, regardless of color mode. This means that in the frame buffer
they will be 64 pixels wide for 4-bit CLUT, 128 pixels wide for 8-bit CLUT and 256 pixels wide for 15-bit direct. The
pixels are addressed with coordinates relative to the location of the texture page, not the frame buffer. So the top left
texture coordinate on a texture page is (0,0) and the bottom right one is (255,255). The pages can be located in the
frame buffer on X multiples of 64 and Y multiples of 256. More than one texture page can be set up, but each
primitive can only contain texture from one page.
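The relation between texture mode, page width and pixel packing can be illustrated with a short C sketch. The names are invented for this example, and it assumes that the leftmost texel sits in the lowest bits of each frame buffer pixel (I0 left, as stated for the CLUT modes above).

```c
#include <assert.h>
#include <stdint.h>

enum tex_mode { TEX_4BIT = 0, TEX_8BIT = 1, TEX_15BIT = 2 };

/* Width of a 256 texel wide page, measured in 16-bit frame buffer pixels. */
static unsigned tex_page_fb_width(enum tex_mode m)
{
    static const unsigned bits_per_texel[] = { 4, 8, 16 };
    assert((unsigned)m < 3);
    return 256u * bits_per_texel[m] / 16u;
}

/* Texel i (counted from the left) stored in one frame buffer pixel.
 * For the CLUT modes the result is an index into the CLUT. */
static unsigned texel_in_word(enum tex_mode m, uint16_t word, unsigned i)
{
    switch (m) {
    case TEX_4BIT: assert(i < 4); return ((unsigned)word >> (4 * i)) & 0xfu;
    case TEX_8BIT: assert(i < 2); return ((unsigned)word >> (8 * i)) & 0xffu;
    default:       assert(i == 0); return word;
    }
}
```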


• Texture Windows

The area within a texture window is repeated throughout the texture page. The data is not actually stored all over
the texture page, but the GPU reads the repeated patterns as if they were there. The X and Y position and the
height and width of the window must be multiples of 8.


• CLUT (Color Lookup Table)

The CLUT is the table where the colors are stored for the image data in the CLUT modes. The pixels of those
images are used as indexes into this table. The CLUT is arranged in the frame buffer as a 256x1 image for the 8-bit
CLUT mode, and a 16x1 image for the 4-bit CLUT mode. Each pixel is a 16-bit value, the first 15 bits used for a 15-bit
color, and the 16th used for semi-transparency. The CLUT data can be arranged in the frame buffer at X multiples of
16 (X=0,16,32,48,etc) and anywhere in the Y range of 0-511. More than one CLUT can be prepared, but only one
can be used for each primitive.


• Texture Caching

If polygons with texture are displayed, the GPU needs to read the texture data from the frame buffer. This slows down the
drawing process, and as a result reduces the number of polygons that can be drawn in a given time span. To speed up this
process the GPU is equipped with a texture cache, so a given piece of texture does not need to be read multiple times in
succession. The texture cache size depends on the color mode used for the textures. In 4-bit CLUT mode it has a size
of 64x64, in 8-bit CLUT mode it is 32x64 and in 15-bit direct mode it is 32x32. A general speed up can be achieved by setting up
textures according to these sizes. For further speed gain a more precise knowledge of how the cache works is
necessary.


Cache blocks 
The texture page is divided into non-overlapping cache blocks, each of a unit size according to color mode. 
These cache blocks are tiled within the texture page. 





- Cache entries
Each cache block is divided into 256 cache entries, which are numbered sequentially and are 8 bytes wide.
So a cache entry holds 16 4-bit CLUT pixels, 8 8-bit CLUT pixels, or 4 15-bit direct pixels.


4-bit and 8-bit CLUT 





The cache can hold only one cache entry with a given number. So if, for example, a
piece of texture spans multiple cache blocks and it has data on entry 9 of block 1, but also on entry 9 of block 2,
these cannot be in the cache at once.


Rendering options 
There are 3 modes which affect the way the GPU renders the primitives to the frame buffer. 


• Semi Transparency

When semi transparency is set for a pixel, the GPU first reads the pixel it wants to write to, and then calculates
the color it will write from the 2 pixels according to the semi-transparency mode selected. Processing speed is lower
in this mode because additional reading and calculating are necessary. There are 4 semi-transparency modes in the
GPU.


B = the pixel read from the image in the frame buffer, F = the half transparent pixel


0.5 x B + 0.5 x F
1.0 x B + 1.0 x F
1.0 x B - 1.0 x F
1.0 x B + 0.25 x F


A new semi transparency mode can be set for each primitive. For primitives without texture, semi-transparency
can be selected for the primitive as a whole. For primitives with texture, semi transparency is stored in the MSB of each pixel, so some pixels can
be set to STP while others are drawn opaque. For the CLUT modes the STP bit is obtained from the CLUT. So if a
color index points to a color in the CLUT with the MSB set, it will be drawn semi transparent.
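The four modes can be written out per 5-bit color component as below. This is a sketch under assumptions, not a description of the exact hardware arithmetic: mode 0 follows the 0.5 x B + 0.5 x F form listed for the abr field of the status register, and the clamping to 0-31 and the rounding are guesses. Function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Blend one 5-bit component. b = value already in the frame buffer,
 * f = value of the semi transparent pixel, mode = 0..3. */
static unsigned blend5(unsigned b, unsigned f, unsigned mode)
{
    int v;
    assert(b < 32 && f < 32 && mode < 4);
    switch (mode) {
    case 0:  v = (int)(b + f) / 2;    break;  /* 0.5 x B + 0.5 x F  */
    case 1:  v = (int)(b + f);        break;  /* 1.0 x B + 1.0 x F  */
    case 2:  v = (int)b - (int)f;     break;  /* 1.0 x B - 1.0 x F  */
    default: v = (int)b + (int)f / 4; break;  /* 1.0 x B + 0.25 x F */
    }
    if (v < 0)  v = 0;    /* clamping is an assumption of this sketch */
    if (v > 31) v = 31;
    return (unsigned)v;
}

/* Blend two 15-bit pixels component by component; bit 15 is left clear. */
static uint16_t blend15(uint16_t back, uint16_t fore, unsigned mode)
{
    unsigned r  = blend5(back & 0x1fu, fore & 0x1fu, mode);
    unsigned g  = blend5((back >> 5) & 0x1fu, (fore >> 5) & 0x1fu, mode);
    unsigned bl = blend5((back >> 10) & 0x1fu, (fore >> 10) & 0x1fu, mode);
    return (uint16_t)((bl << 10) | (g << 5) | r);
}
```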


When the color is black (BGR=0), STP is processed differently from when it's not black (BGR<>0). The table below
shows the differences:


                  Transparency Processing (bit 1 of command packet)
BGR      STP      off                  on
0,0,0    0        Transparent          Transparent
0,0,0    1        Non-transparent      Non-transparent
x,x,x    0        Non-transparent      Non-transparent
x,x,x    1        Non-transparent      Semi-transparent





• Shading

The GPU has a shading function, which will scale the color of a primitive to a specified brightness. There are 2
shading modes: flat shading and Gouraud shading. Flat shading is the mode in which one brightness value is specified
for the entire primitive. In Gouraud shading mode, a different brightness value can be given for each vertex of a
primitive, and the brightness between these points is automatically interpolated.


• Mask

The mask function will prevent the GPU from writing to specific pixels when drawing in the frame buffer. This means
that when the GPU is drawing a primitive to a masked area, it will first read the pixel at the coordinate it wants to
write to, check if its masking bit is set, and if so refrain from writing to that particular pixel. The masking bit is the
MSB of the pixel, just like the STP bit. To set this masking bit, the GPU provides a mask out mode, which will set
the MSB of any pixel it writes. If both mask out and mask evaluation are on, the GPU will not draw to pixels with set
MSBs, and will draw pixels with set MSBs to the others, these in turn becoming masked pixels.
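The per-pixel decision described above might look like this in C. It is a model of the rule only, with invented names; the two flags stand for the mask out and mask evaluation settings.

```c
#include <assert.h>
#include <stdint.h>

#define PIXEL_MSB 0x8000u

/* Returns the value the frame buffer pixel holds after the GPU tries to
 * write new_px over old_px. */
static uint16_t masked_write(uint16_t old_px, uint16_t new_px,
                             int mask_out, int mask_eval)
{
    assert((mask_out == 0 || mask_out == 1) &&
           (mask_eval == 0 || mask_eval == 1));
    if (mask_eval && (old_px & PIXEL_MSB))
        return old_px;                          /* masked: refrain from writing */
    if (mask_out)
        new_px = (uint16_t)(new_px | PIXEL_MSB); /* written pixel becomes masked */
    return new_px;
}
```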


Drawing Environment
The drawing environment specifies all global parameters the GPU needs for drawing primitives.


• Drawing offset

This locates the top left corner of the drawing area. Coordinates of primitives are relative to this point. So if
the drawing offset is (0,240) and a vertex of a polygon is located at (16,20), it will be drawn to the frame buffer at
(0+16,240+20).


• Drawing clip area
This specifies the maximum range the GPU draws primitives to. So in effect it specifies the top left and
bottom right corner of the drawing area.


• Dither enable
When dither is enabled the GPU will dither areas during shading. It will process internally in 24-bit and
dither the colors when converting back to 15-bit. When it is off, the lower 3 bits of each color simply get discarded.


• Draw to display enable
This will enable/disable any drawing to the area that is currently displayed.


• Mask enable
When turned on, any pixel drawn to the frame buffer by the GPU will have a set masking bit (= set MSB).


• Mask judgement enable
Specifies if the mask data from the frame buffer is evaluated at the time of drawing.


Display Environment.
This contains all information about the display, and the area displayed.


• Display area in frame buffer
This specifies the resolution of the display. The size can be set as follows:
Width: 256, 320, 384, 512 or 640 pixels
Height: 240 or 480 pixels


These sizes are only an indication of how many pixels will be displayed using the default start/end values. These
settings only specify the resolution of the display.


• Display start/end

Specifies where the display area is positioned on the screen, and how much data gets sent to the screen. The
screen sizes of the display area are valid only if the horizontal/vertical start/end values are default. By changing these
you can get bigger/smaller display screens. On most TVs there is some black around the edge, which can be utilized
by setting the start of the screen earlier and the end later. The size of the pixels is NOT changed with these settings;
the GPU simply sends more data to the screen. Some monitors/TVs have a smaller display area and the extended size
might not be visible on those sets. (Mine is capable of about 330 pixels horizontal, and 272 vertical in 320*240
mode.)


• Interlace enable
When enabled the GPU will display the even and odd lines of the display area alternately. It is necessary to set
this when using 480 lines, as the number of scan lines on a TV screen is not sufficient to display 480 lines.


• 15-bit/24-bit direct display
Switches between 15-bit/24-bit display mode.


• Video mode
Selects which video mode to use, either PAL or NTSC.


GPU operation
• GPU control registers

There are two 32-bit I/O ports for the GPU, which are at 0x1f80 1810 for GPU Data (GP0) and 0x1f80 1814 for GPU
Control/Status (GP1). The data register is used to exchange data with the GPU, and the control/status register gives the
status of the GPU when read, and sets the control bits when written to.


Control/Status Register 0x1f80 1814 


Status (Read) High (bits 31-16)


Field     Value      Meaning
Width     W0  W1
          00  0      256 pixels
          01  0      320
          10  0      512
          11  0      640
          00  1      384
Height    0          240 pixels
          1          480
Video     0          NTSC
          1          PAL
isrgb24   0          15-bit direct mode
          1          24-bit direct mode
isinter   0          Interlace off
          1          Interlace on
den       0          Display enabled
          1          Display disabled
busy      0          GPU is busy (i.e. drawing primitives)
          1          GPU is idle
img       0          Not ready to send image (packet $c0)
          1          Ready
com       0          Not ready to receive commands
          1          Ready
dma       00         DMA off, communication through GP0
          01         Unknown
          10         DMA CPU -> GPU
          11         DMA GPU -> CPU
lcf       0          Drawing even lines in interlace mode
          1          Drawing uneven lines in interlace mode


Status (Read) Low (bits 15-0)


Field   Value   Meaning
tx      0-0xf   Texture page X = tx*64
ty      0       Texture page Y = 0
        1       Texture page Y = 256
abr     00      0.5 x B + 0.5 x F       Semi transparent state
        01      1.0 x B + 1.0 x F
        10      1.0 x B - 1.0 x F
        11      1.0 x B + 0.25 x F
tp      00      4-bit CLUT              Texture page color mode
        01      8-bit CLUT
        10      15-bit
dtd     0       Dither off
        1       Dither on
dfe     0       off     Draw to display area prohibited
        1       on      Draw to display area allowed
md      0       off     Do not apply mask bit to drawn pixels
        1       on      Apply mask bit to drawn pixels
me      0       off     Draw over pixels with mask bit set
        1       on      No drawing to pixels with set mask bit

Control (Write)


A control command is composed of one word as follows:


| command |      parameter      |
 31     24 23                  0


The composition of the parameter is different for each command.


• Reset GPU
command      0x00
parameter    0x000000
description  Resets the GPU. Also turns off the screen. (Sets status to $14802000)


• Reset Command Buffer
command      0x01
parameter    0x000000
description  Resets the command buffer.


• Reset IRQ
command      0x02
parameter    0x000000
description  Resets the IRQ.


• Display Enable
command      0x03
parameter    0x000000   Display disable
             0x000001   Display enable
description  Turns the display on/off. Note that a turned off screen still gives the flicker of NTSC on a PAL screen if
             NTSC mode is selected.


• DMA setup
command      0x04
parameter    0x000000   DMA disabled
             0x000001   Unknown DMA function
             0x000002   DMA CPU to GPU
             0x000003   DMA GPU to CPU
description  Sets DMA direction.


• Start of display area
command      0x05
parameter    bit 0x00-0x09   X (0-1023)
             bit 0x0a-0x12   Y (0-511)
             = Y<<10 + X
description  Locates the top left corner of the display area.


• Horizontal display range
command      0x06
parameter    bit 0x00-0x0b   X1 (0x1f4-0xcda)
             bit 0x0c-0x17   X2
             = X1 + X2<<12
description  Specifies the horizontal range within which the display area is displayed. The display is relative to
the display start, so X coordinate 0 will be at the value in X1. The display end is not relative to the display start. The
number of pixels that get sent to the screen in 320 mode is (X2-X1)/8. How many are actually visible depends on
your TV/monitor. (Normally $260-$c56)


• Vertical display range
command      0x07
parameter    bit 0x00-0x09   Y1
             bit 0x0a-0x14   Y2
             = Y1 + Y2<<10
description  Specifies the vertical range within which the display area is displayed. The display is relative to the
display start, so Y coordinate 0 will be at the value in Y1. The display end is not relative to the display start. The
number of lines that get sent to the display is Y2-Y1, in 240 mode. (Not sure about the default values, they should be
something like NTSC $010-$100, PAL $023-$123)
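The parameter formulas for the display commands can be sketched as small C helpers. This assumes the command number sits in the top 8 bits of the word and the parameter in the low 24 bits, as values like $04000002 elsewhere in this document suggest; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Combine an 8-bit command number and a 24-bit parameter into one word. */
static uint32_t gp1_cmd(unsigned command, uint32_t param)
{
    assert(command < 0x100 && param < 0x1000000u);
    return ((uint32_t)command << 24) | param;
}

/* Command 0x05: start of display area, parameter = Y<<10 + X. */
static uint32_t gp1_display_start(unsigned x, unsigned y)
{
    assert(x < 1024 && y < 512);
    return gp1_cmd(0x05, ((uint32_t)y << 10) | x);
}

/* Command 0x06: horizontal display range, parameter = X1 + X2<<12. */
static uint32_t gp1_h_range(unsigned x1, unsigned x2)
{
    assert(x1 < 0x1000 && x2 < 0x1000);
    return gp1_cmd(0x06, ((uint32_t)x2 << 12) | x1);
}

/* Command 0x07: vertical display range, parameter = Y1 + Y2<<10. */
static uint32_t gp1_v_range(unsigned y1, unsigned y2)
{
    assert(y1 < 0x400 && y2 < 0x800);
    return gp1_cmd(0x07, ((uint32_t)y2 << 10) | y1);
}

/* Pixels sent to the screen in 320 mode for a given horizontal range. */
static unsigned h_pixels_320(unsigned x1, unsigned x2)
{
    assert(x2 >= x1);
    return (x2 - x1) / 8;
}
```

With the range $260-$c56 quoted above, the last helper gives 318 pixels, which fits the 320 pixel mode.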


• Display mode
command      0x08
parameter    bit 0x00-0x01   Width 0
             bit 0x02        Height
             bit 0x03        Video mode (see above)
             bit 0x04        Isrgb24
             bit 0x05        Isinter
             bit 0x06        Width 1
             bit 0x07        Reverse flag
description  Sets the display mode.


• Unknown
command      0x09
parameter    0x000001   ??
description  Used with value $000001


• GPU Info
command      0x10
parameter    0x000000
             0x000001
             0x000002
             0x000003   Draw area top left
             0x000004   Draw area bottom right
             0x000005   Draw offset
             0x000006
             0x000007   GPU type, should return 2 for a standard GPU.
description  Returns requested info. Read the result from GP0. 0 and 1 seem to return draw area top left also; 6 seems
             to return draw offset too.


• ????
command      0x20
parameter    ??????
description  Used with value $000504

Command Packets, Data Register
Primitive command packets use an 8-bit command value which is present in all packets. It contains a 3-bit
type block and a 5-bit option block, of which the meaning of the bits depends on the type. The layout is as follows:


Type 

000 GPU command 

001 Polygon primitive 

010 Line primitive 

011 Sprite primitive 

100 Transfer command 

111 Environment command 


Configuration of the option blocks for the primitives is as follows: 


Polygon

| 0   0   1 | IIP | VTX | TME | ABE | TGE |
  7   6   5    4     3     2     1     0


Line

| 0   1   0 | IIP | PLL |  0  | ABE |  0  |
  7   6   5    4     3     2     1     0


Sprite

| 0   1   1 |   Size    | TME | ABE |  0  |
  7   6   5    4     3     2     1     0


IIP    0    Flat shading
       1    Gouraud shading
VTX    0    3 vertex polygon
       1    4 vertex polygon
TME    0    Texture mapping off
       1    Texture mapping on
ABE    0    Semi transparency off
       1    Semi transparency on
TGE    0    Brightness calculation at time of texture mapping on
       1    Brightness calculation off (draw texture as is)
Size   00   Free size (specified by W/H)
       01   1 x 1
       10   8 x 8
       11   16 x 16
PLL    0    Single line (2 vertices)
       1    Polyline (n vertices)

• Color information
Color information is forwarded as 24-bit data. It is converted to 15-bit by the GPU.


Layout as follows:


|  Blue   |  Green  |   Red   |
 23     16 15      8 7       0


• Shading information

For textured primitives, shading data is forwarded by this packet. The layout is the same as for color data, with the RGB
values controlling the brightness of the individual colors ($00-$7f). A value of $80 in a color leaves the texture's
value for that color unchanged.


|  Blue   |  Green  |   Red   |
 23     16 15      8 7       0


• Texture Page information
The data is 16 bits wide. The layout is as follows:


|        0        |  TP  | ABR  | TY |     TX     |
 15              9 8    7 6    5  4   3          0


TX     0-0xf   X*64     texture page x coordinate
TY     0       0        texture page y coordinate
       1       256
ABR    0       0.5 x B + 0.5 x F     Semi transparency mode
       1       1.0 x B + 1.0 x F
       2       1.0 x B - 1.0 x F
       3       1.0 x B + 0.25 x F
TP     0       4-bit CLUT
       1       8-bit CLUT
       2       15-bit direct
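A sketch of how the texture page word could be built from a frame buffer position. The bit positions (TX in bits 0-3, TY in bit 4, ABR in bits 5-6, TP in bits 7-8) are an assumption based on the field widths listed above, and the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Encode texture page position, semi transparency mode and color mode. */
static uint16_t tpage_word(unsigned fb_x, unsigned fb_y,
                           unsigned abr, unsigned tp)
{
    assert(fb_x < 1024 && fb_x % 64 == 0);   /* X multiples of 64  */
    assert(fb_y == 0 || fb_y == 256);        /* Y multiples of 256 */
    assert(abr < 4 && tp < 3);
    return (uint16_t)((tp << 7) | (abr << 5) |
                      ((fb_y / 256) << 4) | (fb_x / 64));
}

/* Recover the frame buffer position of the page from the word. */
static unsigned tpage_fb_x(uint16_t t) { return (t & 0xfu) * 64u; }
static unsigned tpage_fb_y(uint16_t t) { return ((t >> 4) & 1u) * 256u; }
```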


• CLUT-ID


Specifies the location of the CLUT data. Data is 16 bits.


| 0  |  Y coordinate 0-511  |  X coordinate X/16  |
  15  14                   6 5                   0


Abbreviations in packet list


BGR     Color/Shading info, see above.
xn,yn   16 bit values of X and Y in frame buffer.
un,vn   8 bit values of X and Y in texture page.
tpage   Texture page information packet, see above.
clut    CLUT ID, see above.


Packet list 

The packets sent to the GPU are processed as a group of data, each one word wide. The data must be
written to the GPU data register ($1f801810) sequentially. Once all data has been received, the GPU
starts operation.


Overview of packet commands:

• Primitive drawing packets
0x20   monochrome 3 point polygon
0x24   textured 3 point polygon
0x28   monochrome 4 point polygon
0x2c   textured 4 point polygon
0x30   gradated 3 point polygon
0x34   gradated textured 3 point polygon
0x38   gradated 4 point polygon
0x3c   gradated textured 4 point polygon
0x40   monochrome line
0x48   monochrome polyline
0x50   gradated line
0x58   gradated polyline
0x60   rectangle
0x64   sprite
0x68   dot
0x70   8*8 rectangle
0x74   8*8 sprite
0x78   16*16 rectangle
0x7c   16*16 sprite

• GPU command & Transfer packets
0x01   clear cache
0x02   frame buffer rectangle draw
0x80   move image in frame buffer
0xa0   send image to frame buffer
0xc0   copy image from frame buffer

• Draw mode/environment setting packets
0xe1   draw mode setting
0xe2   texture window setting
0xe3   set drawing area top left
0xe4   set drawing area bottom right
0xe5   drawing offset
0xe6   mask setting


Packet Descriptions
• Primitive Packets


0x20 monochrome 3 point polygon

BGR                  Command + Color
...


0x24 textured 3 point polygon

...
CLUT | v0 | u0       CLUT ID + texture coordinates vertex 0
...
tpage | v1 | u1      Texture page + texture coordinates vertex 1
...
v2 | u2              Texture coordinates vertex 2


...

BGR                  Command + Color Vertex 0
...
y3 | x3              Vertex 3
v3 | u3              Texture coordinates vertex 3


0x30 gradated 3 point polygon

...


0x58 gradated polyline

...
BGR                  Color Vertex 2
...
BGR                  Color Vertex n

Any number of points can be entered, end with termination code.





0x60 Rectangle

Order   Data (bits 31..0)     Description
1       0x60 | BGR            Command + Color
2       y | x                 Upper left corner location
3       h | w                 Height and width


0x64 Sprite

Order   Data (bits 31..0)     Description
1       0x64 | BGR            Command + Color
2       y | x                 Upper left corner location
3       CLUT | v | u          CLUT ID + texture coordinates page y,x
4       h | w                 Height and width


0x68 Dot

Order   Data (bits 31..0)     Description
1       0x68 | BGR            Command + Color
2       y | x                 Location


0x70 8x8 Rectangle

Order   Data (bits 31..0)     Description
1       0x70 | BGR            Command + Color
2       y | x                 Location


0x74 8x8 Sprite

Order   Data (bits 31..0)     Description
1       0x74 | BGR            Command + Color
2       y | x                 Location
3       CLUT | v | u          CLUT ID + texture coordinates page y,x


0x78 16x16 Rectangle

Order   Data (bits 31..0)     Description
1       0x78 | BGR            Command + Color
2       y | x                 Location


0x7c 16x16 Sprite

Order   Data (bits 31..0)     Description
1       0x7c | BGR            Command + Color
2       y | x                 Location
3       CLUT | v | u          CLUT ID + texture coordinates page y,x
GPU command & Transfer packets 


0x01 Clear cache

Order   Data (bits 31..0)     Description
1       0x01 | 0              Clear cache


0x02 frame buffer rectangle draw

Order   Data (bits 31..0)     Description
1       0x02 | BGR            Command + Color
2       y | x                 Upper left corner location
3       h | w                 Height and width

Fills the area in the frame buffer with the value in RGB. This command will draw without regard to drawing
environment settings. Coordinates are absolute frame buffer coordinates. Max width is 0x3ff, max height is 0x1ff.


0xa0 send image to frame buffer

...

Transfers data from main memory to the frame buffer. If the number of pixels to be sent is odd, an extra pixel should
be sent. (32 bits per packet)


0xc0 copy image from frame buffer

Order   Data (bits 31..0)     Description
1       0x01                  Reset command buffer (write to GP1 or GP0)
...

Transfers data from the frame buffer to main memory. Wait for bit 27 of the status register to be set before
reading the image data. When the number of pixels is odd, an extra pixel is read at the end (because one packet is 32
bits).


Draw mode/environment setting packets 

Some of these settings can also be set by primitive packets. In any case it is the last packet of either kind that the
GPU received that is active. So if a primitive sets tpage info, it will overwrite the existing data, even if it was sent by
an 0xe? packet.


0xe1 draw mode setting


| 0xe1 |          | dfe | dtd |  tp  | abr  | ty |   tx   |
 31  24 23      11  10     9   8    7 6    5  4   3      0


See above for explanations.


It seems that bits 11-13 of the status register can also be passed with this command on some GPUs other
than type 2 (i.e. where command 0x10000007 doesn't return 2).


0xe2 texture window setting


| 0xe2 |       | twy  | twx  | twh  | tww  |
 31  24 23   20 19  15 14  10 9    5 4    0


twx    Texture window X, (twx*8)
twy    Texture window Y, (twy*8)
tww    Texture window width, 256-(tww*8)
twh    Texture window height, 256-(twh*8)


0xe3 set drawing area top left


| 0xe3 |    | Y | X |


Sets the drawing area top left corner. X & Y are absolute frame buffer coordinates.


0xe4 set drawing area bottom right


| 0xe4 |    | Y | X |


Sets the drawing area bottom right corner. X & Y are absolute frame buffer coordinates.


0xe5 drawing offset


| 0xe5 |       | Offset Y | Offset X |
 31  24 23   22 21      11 10       0


Offset Y = y << 11
Sets the drawing offset. X & Y are offsets in the frame buffer.


0xe6 mask setting


| 0xe6 |       | Mask2 | Mask1 |
 31  24 23    2    1       0


Mask1   Set mask bit while drawing. 1 = on
Mask2   Do not draw to mask areas. 1 = on

While Mask1 is on, the GPU will set the MSB of all pixels it draws. While Mask2 is on, the GPU will not
write to pixels with set MSBs.


DMA and the GPU 

The GPU has two DMA channels allocated to it. DMA channel 2 is used to send linked packet lists to the 
GPU and to transfer image data to and from the frame buffer. DMA channel 6 is sets up an empty linked list, of 
which each entry points to the previous (i.e. reverse clear an OT.) 


DMA Second Memory Address Register (D2_MADR) 0x1f80 10a0


|  MADR  |
 31     0


MADR   Pointer to the virtual address the DMA will start reading from/writing to.


DMA Second Block Control Register (D2_BCR) 0x1f80_10a4


|   BA   |   BS   |
 31    16 15     0


BA   Amount of blocks
BS   Block size (words)


Sets up the DMA blocks. Once started the DMA will send BA blocks of BS
words. Don't set a block size larger than $10 words, as the command buffer
of the GPU is 64 bytes.


DMA Second Channel Control Register (D2_CHCR) 0x1f80_10a8


|  0  | TR |  0  | LI | CO |  0  | DR |
   7    1    13    1    1     8    1        (field widths in bits, bit 31 on the left)

TR   0   No DMA transfer busy.
     1   Start DMA transfer/DMA transfer busy.
LI   1   Transfer linked list. (GPU only)
CO   1   Transfer continuous stream of data.
DR   1   Direction from memory
     0   Direction to memory


This configures the DMA channel. The DMA starts when bit $18 is set. DMA is finished as soon as bit $18 is
cleared again. To send or receive data to/from VRAM, send the appropriate GPU packets first (0xa0/0xc0).


DMA Sixth Memory Address Register (D6_MADR) 0x1f80_10e0


|  MADR  |
 31     0


MADR   Pointer to the virtual address of the last entry.


DMA Sixth Block Control Register (D6_BCR) 0x1f80 10e4


|   BC   |
 31     0

BC   Number of list entries.


DMA Sixth Channel Control Register (D6_CHCR) 0x1f80 10e8


|  0  | TR |  0  | LI | CO |  0  | DR |
   7    1    13    1    1     8    1        (field widths in bits, bit 31 on the left)

TR   0   No DMA transfer busy.
     1   Start DMA transfer/DMA transfer busy.
LI   1   Transfer linked list. (GPU only)
CO   1   Transfer continuous stream of data.
DR   1   Direction from memory
     0   Direction to memory


This configures the DMA channel. The DMA starts when bit $18 is set. DMA is finished as soon as bit $18 is
cleared again. To send or receive data to/from VRAM, send the appropriate GPU packets first (0xa0/0xc0). When this
register is set to $11000002, the DMA channel will create an empty linked list of D6_BCR entries ending at the
address in D6_MADR. Each entry has a size of 0, and points to the previous one. The first entry holds the
terminator ($00ffffff). So if D6_MADR = $80100010, D6_BCR = $00000004, and the DMA is kicked, this will result
in a list looking like this:


0x8010 0000   0x00ff ffff
0x8010 0004   0x0010 0000
0x8010 0008   0x0010 0004
0x8010 000c   0x0010 0008
0x8010 0010   0x0010 000c


DMA Primary Control Register (DPCR) 0x1f80 1010


The register is laid out as eight 4-bit blocks. Each DMA channel has a 4 bit control block allocated in this register.

Bit 3   1 = DMA enabled
    2   Unknown
    1   Unknown
    0   Unknown


Bit 3 must be set for a channel to operate.


Common GPU functions, step by step.
• Initializing the GPU
First thing to do when using the GPU is to initialize it. To do that take the following steps:


1 - Reset the GPU (GP1 command $00). This turns off the display as well.
2 - Set horizontal and vertical start/end. (GP1 command $06, $07)
3 - Set display mode. (GP1 command $08)
4 - Set display offset. (GP1 command $05)
5 - Set draw mode. (GP0 command $e1)
6 - Set draw area. (GP0 command $e3, $e4)
7 - Set draw offset. (GP0 command $e5)
8 - Enable display. (GP1 command $03)


• Sending a linked list


The normal way to send large numbers of primitives is by using a linked list DMA transfer. This list is built up
of entries of which each points to the next. One entry looks like this:


      dw $nnYYYYYY    ; nn = the number of words in the list entry
                      ; YYYYYY = address of next list entry & 0x00ff ffff
1     dw ..           ; here goes the primitive.
2     dw ..
.     dw ..
nn-1  dw ..
nn    dw ..


The last entry in the list should have 0xffffff as pointer, which is the terminator. As soon as this value is
found, DMA is ended. If the entry size is set to 0, no data will be transferred to the GPU and the next entry is
processed.


To send the list do this: 

1 - Wait for the GPU to be ready to receive commands. (bit $1c == 1)

2 - Enable DMA channel 2.

3 - Set GPU to DMA CPU->GPU mode. ($04000002)

4 - Set D2_MADR to the start of the list.

5 - Set D2_BCR to zero.

6 - Set D2_CHCR to link mode, memory->GPU and DMA enable. ($01000401)
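The sequence above could be sketched in C roughly as follows. To keep the example runnable on any machine, the registers are modelled as fields of a stand-in struct (`gpu_regs`, a made-up name) instead of the real memory-mapped addresses. Enabling channel 2 by setting bit 11 of DPCR (bit 3 of the third 4-bit block) is an assumption; the other values are the ones quoted in the steps.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware registers; on a real PSX these would be the
 * memory-mapped ports listed earlier, not ordinary variables. */
typedef struct {
    volatile uint32_t gpustat;   /* 0x1f80 1814 when read     */
    volatile uint32_t gp1;       /* 0x1f80 1814 when written  */
    volatile uint32_t dpcr;      /* 0x1f80 1010               */
    volatile uint32_t d2_madr;   /* 0x1f80 10a0               */
    volatile uint32_t d2_bcr;    /* 0x1f80 10a4               */
    volatile uint32_t d2_chcr;   /* 0x1f80 10a8               */
} gpu_regs;

static void gpu_send_list(gpu_regs *r, uint32_t list_addr)
{
    assert(r != 0);
    while (!(r->gpustat & (1u << 0x1c))) {
        /* wait until the GPU is ready to receive commands */
    }
    r->dpcr   |= 1u << 11;       /* enable DMA channel 2 (assumed bit)   */
    r->gp1     = 0x04000002u;    /* GPU: DMA direction CPU -> GPU        */
    r->d2_madr = list_addr;      /* start of the linked list             */
    r->d2_bcr  = 0;              /* block control is not used here       */
    r->d2_chcr = 0x01000401u;    /* link mode, from memory, start DMA    */
}
```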


e Uploading Image data through DMA. 
To upload an image to VRAM take the following steps: 


1 - Wait for the GPU to be idle and DMA to finish. Enable DMA channel 2 if necessary.
2 - Send the 'Send image to VRAM' primitive. (You can send this through DMA if you want. Use the linked list
    method described above.)
3 - Set DMA to CPU->GPU ($04000002) (if you didn't do so already in the previous step)
4 - Set D2_MADR to the start of the image data.
5 - Set D2_BCR with: bits 31-16 = Number of words to send (H*W/2)
                     bits 15- 0 = Block size of 1 word. ($01)
    If H*W is odd, add 1. (Pixels are 2 bytes, send an extra blank pixel in case of an odd amount)
6 - Set D2_CHCR to continuous mode, memory->GPU and DMA enable. ($01000201)


Note that H, W, X and Y are always in frame buffer pixels, even if you send image data in other formats. 

You can use bigger block sizes if you need more speed. If the number of words to be sent is not a multiple of the
block size, you'll have to send the remainder separately, because the GPU only accepts an extra halfword if the
number of pixels is odd. (i.e. of the last word sent, only the low half word is used.) Also take care not to use block
sizes bigger than 0x10, as the buffer of the GPU is only 64 bytes (0x10 words).
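
As a small worked example of the D2_BCR value for a block size of one word, the following sketch computes the word count with the odd-pixel rounding described above. The function names are hypothetical, and the caller is assumed to keep the word count within the 16 bits available in bits 31-16.

```c
#include <stdint.h>

/* Number of 32 bit words for a W x H image of 2 byte pixels,
   rounded up so an odd pixel count gets one extra blank pixel. */
static uint32_t image_words(uint32_t w, uint32_t h)
{
    return (w * h + 1u) / 2u;
}

/* D2_BCR for a transfer in blocks of one word:
   bits 31-16 = number of words, bits 15-0 = block size ($01). */
static uint32_t image_bcr(uint32_t w, uint32_t h)
{
    return (image_words(w, h) << 16) | 0x0001u;
}
```

For a 16x16 image this gives 128 words and a D2_BCR value of $00800001.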


• Waiting to send commands


You can send new commands as soon as DMA has ceased and the GPU is ready.
1 - Wait for bit $18 to become 0 in D2_CHCR
2 - Wait for bit $1c to become 1 in GP1.
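
The two conditions can be expressed as a predicate on the register values. This is a sketch with hypothetical names; it takes the values as arguments so that reading the registers themselves is left to the caller.

```c
#include <stdint.h>
#include <stdbool.h>

#define D2_CHCR_BUSY   (1u << 0x18)  /* bit $18: DMA still in progress     */
#define GPU_CMD_READY  (1u << 0x1c)  /* bit $1c: GPU ready for commands    */

/* True when DMA has ceased and the GPU reports ready. */
static bool gpu_can_send(uint32_t d2_chcr, uint32_t gp1_status)
{
    return (d2_chcr & D2_CHCR_BUSY) == 0 && (gp1_status & GPU_CMD_READY) != 0;
}
```

A caller would poll this in a loop, re-reading both registers each time, before writing the next command.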


The Geometry Transformation Engine (GTE) 

The Geometry Transformation Engine (GTE) is the heart of all 3D calculations on the PSX. The GTE can
perform vector and matrix operations, perspective transformation, color equations and the like. It is much faster than
the CPU on these operations. It is mounted as the second coprocessor and as such has no physical address in the
memory of the PSX. All control is done through special instructions.


Basic mathematics 

The GTE is basically an engine for vector mathematics. The basic representation of a point (vertex) in 3d
space is through a vector of the sort [X,Y,Z]. In GTE operation there are basically two kinds of these: vectors of variable
length and vectors of a unit length of 1.0, called normal vectors. The first is used to describe locations and
translations in 3d space, the second to describe a direction.

Rotation of vertices is performed by multiplying the vector of the vertex with a rotation matrix. The rotation 
matrix is a 3x3 matrix consisting of 3 normal vectors which are orthogonal to each other. (It's actually the matrix 
which describes the coordinate system in which the vertex is located in relation to the unit coordinate system. See a 
math book for more details.) This matrix is derived from rotation angles as follows: 


sn = sin(n), cn = cos(n) 


Rotation angle A about X axis:

|  1    0    0  |
|  0    cA  -sA |
|  0    sA   cA |


Rotation angle B about Y axis:

|  cB   0    sB |
|  0    1    0  |
| -sB   0    cB |


Rotation angle C about Z axis:

|  cC  -sC   0  |
|  sC   cC   0  |
|  0    0    1  |


Rotation about multiple axes can be done by multiplying these matrices with each other. Note that the order
in which this multiplication is done *IS* important. The GTE has no sine or cosine functions, so the calculation of
these must be done by the CPU.
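
A minimal sketch of how the CPU might build and combine these matrices in 1,3,12 fixed point (1.0 = 4096). The names are hypothetical, and the sine and cosine values are assumed to come from a table or other CPU routine, already scaled by 4096.

```c
#include <stdint.h>

#define FX_ONE 4096   /* 1.0 in 1,3,12 fixed point */

typedef struct { int32_t m[3][3]; } Mat3;

/* s and c are sin and cos of the angle, scaled by FX_ONE. */
static Mat3 rot_x(int32_t s, int32_t c)
{
    Mat3 r = {{{FX_ONE, 0, 0}, {0, c, -s}, {0, s, c}}};
    return r;
}

static Mat3 rot_y(int32_t s, int32_t c)
{
    Mat3 r = {{{c, 0, s}, {0, FX_ONE, 0}, {-s, 0, c}}};
    return r;
}

static Mat3 rot_z(int32_t s, int32_t c)
{
    Mat3 r = {{{c, -s, 0}, {s, c, 0}, {0, 0, FX_ONE}}};
    return r;
}

/* r = a * b, rescaling each sum of products back to 12 fractional bits. */
static Mat3 mat_mul(const Mat3 *a, const Mat3 *b)
{
    Mat3 r;
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            int64_t acc = 0;
            for (int k = 0; k < 3; k++)
                acc += (int64_t)a->m[i][k] * b->m[k][j];
            r.m[i][j] = (int32_t)(acc / FX_ONE);
        }
    }
    return r;
}
```

Multiplying a 90 degree Z rotation by a 90 degree X rotation in the two possible orders gives different matrices, which illustrates why the order matters.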

Translation is the simple addition of two vectors, relocating the vertex within its current coordinate system. 
Needless to say the order in which translation and rotation occur for a vector is important. 


Brief Function descriptions 


RTPS/RTPT
Rotate, translate and perspective transformation.

These two functions perform the final 3d calculations on one or three vertices at once. The points are first 
multiplied with a rotation matrix(R), and after that translated(TR). Finally a perspective transformation is applied, 
which results in 2d screen coordinates. It also returns an interpolation value to be used with the various depth cueing 
instructions. 


MVMVA 
Matrix & Vector multiplication and addition. 

Multiplies a vector with either the rotation matrix, the light matrix or the color matrix and then adds the 
translation vector or background color vector. 


DPCL
Depth cue light color 

First calculates a color from a light vector(normal vector of a plane multiplied with the light matrix and zero 
limited) and a provided RGB value. Then performs depth cueing by interpolating between the far color vector and 
the newfound color. 


DPCS/DPCT 
Depth cue single/triple 
Performs depth cueing by interpolating between a color and the far color vector on one or three colors. 


INTPL 
Interpolation 
Interpolates between a vector and the far color vector. 


SQR 
Square 
Calculates the square of a vector. 


NCS/NCT 
Normal Color 

Calculates a color from the normal of a point or plane and the light sources and colors. The basic color of 
the plane or point the normal refers to is assumed to be white. 


NCDS/NCDT 
Normal Color Depth Cue. 
Same as NCS/NCT but also performs depth cueing (like DPCS/DPCT) 


NCCS/NCCT
Same as NCS/NCT, but the base color of the plane or point is taken into account.


CDP 
A color is calculated from a light vector (base color is assumed to be white) and depth cueing is performed 
(like DPCS). 


CC 


A color is calculated from a light vector and a base color. 


NCLIP 
Calculates the outer product of three 2d points.(ie. 3 vertices which define a plane after projection.) 


The 3 vertices should be stored clockwise according to the visual point. If this is so, the result of this
function will be negative if we are facing the backside of the plane.


AVSZ3/AVSZ4

Adds 3 or 4 z values together and multiplies them by a fixed point value. This value is normally chosen so
that this function returns the average of the z values (usually further divided by 2 or 4 for easy adding to the OT).


OP
Calculates the outer product of 2 vectors.


GPF
Multiplies 2 vectors. Also returns the result as 24bit rgb value.


GPL
Multiplies a vector with a scalar and adds the result to another vector. Also returns the result as 24bit rgb
value.


Instructions


The CPU has six special load and store instructions for the GTE registers, and an instruction to issue 
commands to the coprocessor. 


rt CPU register 0-31 

gd GTE data register 0-31 

gc GTE control register 0-31 

imm 16 bit immediate value 

base CPU register 0-31 

imm(base) address pointed to by base + imm. 
b25 25 bit wide data field. 


LWC2 gd, imm(base)   stores value at imm(base) in GTE data register gd.
SWC2 gd, imm(base)   stores GTE data register gd at imm(base).
MTC2 rt, gd          stores register rt in GTE data register gd.
MFC2 rt, gd          stores GTE data register gd in register rt.
CTC2 rt, gc          stores register rt in GTE control register gc.
CFC2 rt, gc          stores GTE control register gc in register rt.
COP2 b25             Issues a GTE command.


GTE load and store instructions have a delay of 2 instructions, for any GTE commands or operations accessing that
register.
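
For readers writing an assembler or emulator, the instruction words can be sketched as below. The bit layouts are the standard MIPS I coprocessor 2 encodings rather than something stated in this document, and the function names are hypothetical.

```c
#include <stdint.h>

/* Register move instructions: rt in bits 20-16, GTE register in bits 15-11. */
static uint32_t enc_mfc2(unsigned rt, unsigned gd) { return 0x48000000u | ((rt & 31u) << 16) | ((gd & 31u) << 11); }
static uint32_t enc_cfc2(unsigned rt, unsigned gc) { return 0x48400000u | ((rt & 31u) << 16) | ((gc & 31u) << 11); }
static uint32_t enc_mtc2(unsigned rt, unsigned gd) { return 0x48800000u | ((rt & 31u) << 16) | ((gd & 31u) << 11); }
static uint32_t enc_ctc2(unsigned rt, unsigned gc) { return 0x48c00000u | ((rt & 31u) << 16) | ((gc & 31u) << 11); }

/* Memory instructions: base in bits 25-21, GTE data register in bits 20-16,
   16 bit immediate offset in bits 15-0. */
static uint32_t enc_lwc2(unsigned gd, uint16_t imm, unsigned base)
{
    return 0xc8000000u | ((base & 31u) << 21) | ((gd & 31u) << 16) | imm;
}

static uint32_t enc_swc2(unsigned gd, uint16_t imm, unsigned base)
{
    return 0xe8000000u | ((base & 31u) << 21) | ((gd & 31u) << 16) | imm;
}

/* GTE command: the 25 bit field b25 goes in the low bits. */
static uint32_t enc_cop2(uint32_t b25)
{
    return 0x4a000000u | (b25 & 0x01ffffffu);
}
```

For example, the RTPS command `cop2 0x0180001` assembles to the word 0x4a180001.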


Programming the GTE. 
Before use the GTE must be turned on. The GTE has bit 30 allocated to it in the status register of the
system control coprocessor (cop0). Before any GTE instruction is used, this bit must be set.


GTE instructions and functions should not be used in 
- Delay slots of jumps and branches 
- Event handlers or interrupts. 


If an instruction that reads a GTE register or a GTE command is executed before the current GTE command is 
finished, the CPU will hold until the instruction has finished. The number of cycles each GTE instruction takes is in 
the command list. 


Registers. 

The GTE has 32 data registers, and 32 control registers, each 32 bits wide. The control registers are 
commonly called Cop2C, while the data registers are called Cop2D. The following list describes their common use. 
The format is explained later on. 


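No.  Name     Description
 0   R11R12   Rotation matrix elements 1 to 1, 1 to 2
 1   R13R21   Rotation matrix elements 1 to 3, 2 to 1
 2   R22R23   Rotation matrix elements 2 to 2, 2 to 3
 3   R31R32   Rotation matrix elements 3 to 1, 3 to 2
 4   R33      Rotation matrix elements 3 to 3
 5   TRX      Translation vector X
 6   TRY      Translation vector Y
 7   TRZ      Translation vector Z
 8   L11L12   Light source matrix elements 1 to 1, 1 to 2
 9   L13L21   Light source matrix elements 1 to 3, 2 to 1
10   L22L23   Light source matrix elements 2 to 2, 2 to 3
11   L31L32   Light source matrix elements 3 to 1, 3 to 2
12   L33      Light source matrix elements 3 to 3
13   RBK      Background color red component
14   GBK      Background color green component
15   BBK      Background color blue component
16   LR1LR2   Light color matrix source 1&2 red comp
17   LR3LG1   Light color matrix source 3 red, 1 green comp
18   LG2LG3   Light color matrix source 2&3 green comp
19   LB1LB2   Light color matrix source 1&2 blue comp
20   LB3      Light color matrix source 3 blue comp
21   RFC      Far color red component
22   GFC      Far color green component
23   BFC      Far color blue component
24   OFX      Screen offset X
25   OFY      Screen offset Y
26   H        View plane distance
27   DQA      Depth cue interpolation value
28   DQB      Depth cue interpolation value
29   ZSF3     Divider for AVSZ3
30   ZSF4     Divider for AVSZ4
31   FLAG     Returns any calculation errors. (See below)
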

Control Register format 
The GTE uses signed, fixed point registers for mathematics. The following is a bit-wise description of the 
registers. 


Register(s)                               Format (sign, integral part, fractional part)
R11R12, R13R21, R22R23, R31R32, R33       1,3,12 per element
TRX, TRY, TRZ                             1,31,0
L11L12, L13L21, L22L23, L31L32, L33       1,3,12 per element
RBK, GBK, BBK                             1,19,12
LR1LR2, LR3LG1, LG2LG3, LB1LB2, LB3       1,3,12 per element
RFC, GFC, BFC                             1,27,4
OFX, OFY                                  1,15,16
H                                         0,16,0
DQA                                       1,7,8
DQB                                       1,7,8
ZSF3                                      1,3,12
ZSF4                                      1,3,12

Registers that hold two matrix elements store one 16 bit element (sign, integral part, fractional part) in each
half of the register.

FLAGS 


Flags bit description. 


31   Logical sum of bits 30-23 and bits 18-13
30   Calculation test result #1 overflow (2^43 or more)
29   Calculation test result #2 overflow (2^43 or more)
28   Calculation test result #3 overflow (2^43 or more)
27   Calculation test result #1 underflow (less than -2^43)
26   Calculation test result #2 underflow (less than -2^43)
25   Calculation test result #3 underflow (less than -2^43)
24   Limiter A1 out of range (less than 0, or less than -2^15, or 2^15 or more)
23   Limiter A2 out of range (less than 0, or less than -2^15, or 2^15 or more)
22   Limiter A3 out of range (less than 0, or less than -2^15, or 2^15 or more)
21   Limiter B1 out of range (less than 0, or 2^8 or more)
20   Limiter B2 out of range (less than 0, or 2^8 or more)
19   Limiter B3 out of range (less than 0, or 2^8 or more)
18   Limiter C out of range (less than 0, or 2^16 or more)
17   Divide overflow generated (quotient of 2.0 or more)
16   Calculation test result #4 overflow (2^31 or more)
15   Calculation test result #4 underflow (less than -2^31)
14   Limiter D1 out of range (less than -2^10, or 2^10 or more)
13   Limiter D2 out of range (less than -2^10, or 2^10 or more)
12   Limiter E out of range (less than 0, or 2^12 or more)
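
The rule for bit 31 can be written down directly. The following sketch (hypothetical function name, intended for an emulator or test harness) derives the summary bit from the other flag bits exactly as the first table entry describes.

```c
#include <stdint.h>

/* Bits 30-23 and bits 18-13, the ones summarized by bit 31. */
#define FLAG_SUM_MASK 0x7f87e000u

/* Returns the flag value with bit 31 recomputed as the logical sum
   (OR) of bits 30-23 and 18-13. Bits 11-0 are unused and cleared. */
static uint32_t gte_flag_with_sum(uint32_t flag)
{
    flag &= 0x7ffff000u;
    if (flag & FLAG_SUM_MASK)
        flag |= 0x80000000u;
    return flag;
}
```

Note that bits 22 to 19 and bit 12 do not contribute to bit 31.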

Data Registers 


The data registers make up the other "half" of the GTE. Note that in some functions the format is different from the
one that's given here. The numbers in the format fields are the sign, integer and fractional parts of the field. So
1,3,12 means signed (1 bit), 3 bits integral part, 12 bits fractional part.


Data Registers (Cop2D)

No.  Name   r/w  Bits 31-16   Bits 15-0   Format             Description
 0   VXY0   r/w  VY0          VX0         1,3,12 or 1,15,0   Vector 0 X and Y
 1   VZ0    r/w  0            VZ0         1,3,12 or 1,15,0   Vector 0 Z
 2   VXY1   r/w  VY1          VX1         1,3,12 or 1,15,0   Vector 1 X and Y
 3   VZ1    r/w  0            VZ1         1,3,12 or 1,15,0   Vector 1 Z
 4   VXY2   r/w  VY2          VX2         1,3,12 or 1,15,0   Vector 2 X and Y
 5   VZ2    r/w  0            VZ2         1,3,12 or 1,15,0   Vector 2 Z
 6   RGB    r/w  Code, R, G, B            8 bits for each    RGB value. Code is passed, but not used in calculation
 7   OTZ    r    0            OTZ         0,15,0             Average value.
 8   IR0    r/w  Sign         IR0         1,3,12             Intermediate value 0. Format may differ
 9   IR1    r/w  Sign         IR1         1,3,12             Intermediate value 1. Format may differ
10   IR2    r/w  Sign         IR2         1,3,12             Intermediate value 2. Format may differ
11   IR3    r/w  Sign         IR3         1,3,12             Intermediate value 3. Format may differ
12   SXY0   r/w  SY0          SX0         1,15,0             Screen XY coordinate FIFO (Note 1)
13   SXY1   r/w  SY1          SX1         1,15,0             Screen XY coordinate FIFO
14   SXY2   r/w  SY2          SX2         1,15,0             Screen XY coordinate FIFO
15   SXYP   r/w  SYP          SXP         1,15,0             Screen XY coordinate FIFO
16   SZ0    r/w  0            SZ0         0,16,0             Screen Z FIFO (Note 1)
17   SZ1    r/w  0            SZ1         0,16,0             Screen Z FIFO
18   SZ2    r/w  0            SZ2         0,16,0             Screen Z FIFO
19   SZ3    r/w  0            SZ3         0,16,0             Screen Z FIFO
20   RGB0   r/w  CD0, B0, G0, R0          8 bits each        Color FIFO (Note 1)
21   RGB1   r/w  CD1, B1, G1, R1          8 bits each        Color FIFO
22   RGB2   r/w  CD2, B2, G2, R2          8 bits each        Color FIFO. CD2 is the bit pattern of currently
                                                             executed function
23   RES1   -    -            -           -                  Reserved
24   MAC0   r/w  MAC0                     1,31,0             Sum of products value 0
25   MAC1   r/w  MAC1                     1,31,0             Sum of products value 1
26   MAC2   r/w  MAC2                     1,31,0             Sum of products value 2
27   MAC3   r/w  MAC3                     1,31,0             Sum of products value 3
28   IRGB   w    0            IB, IG, IR  Note 2             Note 2
29   ORGB   r    0            OB, OG, OR  Note 3             Note 3
30   LZCS   w    LZCS                     1,31,0             Leading zero count source data (Note 4)
31   LZCR   r    LZCR                     6,6,0              Leading zero count result (Note 4)


Note 1
The SXYx, SZx and RGBx are first in first out registers (FIFO). The last calculation result is stored in the
last register, and previous results are stored in previous registers. So for example when a new SXY value is obtained
the following happens:

    SXY0 = SXY1
    SXY1 = SXY2
    SXY2 = SXYP
    SXYP = result.


Note 2
When writing a value to IRGB the following happens:
    IR1 = IR format converted to (1,11,4)
    IR2 = IG format converted to (1,11,4)
    IR3 = IB format converted to (1,11,4)


Note 3
When reading a value from ORGB the following happens:
    OR = (IR1>>7) & 0x1f
    OG = (IR2>>7) & 0x1f
    OB = (IR3>>7) & 0x1f


Note 4
Reading LZCR returns the leading 0 count of LZCS if LZCS is positive and the leading 1 count of LZCS if
LZCS is negative.
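
The leading count of Note 4 can be modelled in a few lines of C. This is a sketch with a hypothetical function name, and it assumes a value of zero counts as positive, which gives a result of 32.

```c
#include <stdint.h>

/* Leading zero count for positive values, leading one count for
   negative values, as read back from LZCR. */
static int gte_lzcr(int32_t lzcs)
{
    uint32_t v = (uint32_t)lzcs;
    if (lzcs < 0)
        v = ~v;              /* count leading ones by inverting first */
    int n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
        n++;
    return n;
}
```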


GTE Commands. 

This part describes the actual calculations performed by the various GTE functions. The first line contains 
the name of the function, the number of cycles it takes and a brief description. The second part contains any fields 
that may be set in the opcode and in the third line is the actual opcode. See the end of the list for the fields and their 
descriptions. Then follows a list of all registers which are needed in the calculation under the 'in', and a list of
registers which are modified under the 'out' with a brief description and the format of the data. Next follows the
calculation which is performed after initiating the function. The format field left is the size in which the data is 
stored, the format field on the right contains the format in which the calculation is performed. At certain points in the 
calculation checks and limitations are done and their results stored in the flag register, see the table below. They are 
identified with the code from the second column of the table directly followed by square brackets enclosing the part 
of the calculation on which the check is performed. The additional Lm identifier means the value is limited to the 
bottom or ceiling of the check if it exceeds the boundary. 


bit  code  description
31         Checksum.
30   A1    Result larger than 43 bits and positive
29   A2    Result larger than 43 bits and positive
28   A3    Result larger than 43 bits and positive
27   A1    Result larger than 43 bits and negative
26   A2    Result larger than 43 bits and negative
25   A3    Result larger than 43 bits and negative
24   B1    Value negative (lm=1) or larger than 15 bits (lm=0)
23   B2    Value negative (lm=1) or larger than 15 bits (lm=0)
22   B3    Value negative (lm=1) or larger than 15 bits (lm=0)
21   C1    Value negative or larger than 8 bits.
20   C2    Value negative or larger than 8 bits.
19   C3    Value negative or larger than 8 bits.
18   D     Value negative or larger than 16 bits.
17   E     Divide overflow. (quotient > 2.0)
16   F     Result larger than 31 bits and positive.
15   F     Result larger than 31 bits and negative.
14   G1    Value larger than 10 bits.
13   G2    Value larger than 10 bits.
12   H     Value negative or larger than 12 bits.


RTPS    cop2 0x0180001    Perspective transform

Fields: None

In:   V0        Vector to transform.                      [1,15,0]
      R         Rotation matrix                           [1,3,12]
      TR        Translation vector                        [1,31,0]
      H         View plane distance                       [0,16,0]
      DQA       Depth cue interpolation values.           [1,7,8]
      DQB                                                 [1,7,8]
      OFX       Screen offset values.                     [1,15,16]
      OFY                                                 [1,15,16]
Out:  SXY fifo  Screen XY coordinates. (short)            [1,15,0]
      SZ fifo   Screen Z coordinate. (short)              [0,16,0]
      IR0       Interpolation value for depth cueing.     [1,3,12]
      IR1       Screen X (short)                          [1,15,0]
      IR2       Screen Y (short)                          [1,15,0]
      IR3       Screen Z (short)                          [1,15,0]
      MAC1      Screen X (long)                           [1,31,0]
      MAC2      Screen Y (long)                           [1,31,0]
      MAC3      Screen Z (long)                           [1,31,0]
Calculation:
[1,31,0] MAC1=A1[TRX + R11*VX0 + R12*VY0 + R13*VZ0]       [1,31,12]
[1,31,0] MAC2=A2[TRY + R21*VX0 + R22*VY0 + R23*VZ0]       [1,31,12]
[1,31,0] MAC3=A3[TRZ + R31*VX0 + R32*VY0 + R33*VZ0]       [1,31,12]
[1,15,0] IR1= Lm_B1[MAC1]                                 [1,31,0]
[1,15,0] IR2= Lm_B2[MAC2]                                 [1,31,0]
[1,15,0] IR3= Lm_B3[MAC3]                                 [1,31,0]
         SZ0<-SZ1<-SZ2<-SZ3
[0,16,0] SZ3= Lm_D(MAC3)                                  [1,31,0]
         SX0<-SX1<-SX2, SY0<-SY1<-SY2
[1,15,0] SX2= Lm_G1[F[OFX + IR1*(H/SZ)]]                  [1,27,16]
[1,15,0] SY2= Lm_G2[F[OFY + IR2*(H/SZ)]]                  [1,27,16]
[1,31,0] MAC0= F[DQB + DQA * (H/SZ)]                      [1,19,24]
[1,15,0] IR0= Lm_H[MAC0]                                  [1,31,0]











Notes:
Z values are limited downwards at 0.5 * H. For smaller z values you'll have
to write your own routine.
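
To make the calculation above easier to follow, here is a simplified C sketch of the rotate, translate and project steps. It is not a bit-exact model: the flag register and depth cue outputs are left out, the hardware divider is replaced by an ordinary integer division, and saturating the quotient when SZ is at or below H/2 is an assumption based on the note above. All type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    int16_t  r[3][3];     /* rotation matrix, 1,3,12    */
    int32_t  tr[3];       /* translation vector, 1,31,0 */
    int32_t  ofx, ofy;    /* screen offsets, 1,15,16    */
    uint16_t h;           /* view plane distance        */
} GteCtx;

typedef struct { int32_t sx, sy; uint16_t sz; } ScreenPoint;

static int64_t clamp64(int64_t v, int64_t lo, int64_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

static ScreenPoint rtps_sketch(const GteCtx *c, const int16_t v[3])
{
    int64_t mac[3];
    for (int i = 0; i < 3; i++) {
        int64_t sum = (int64_t)c->r[i][0] * v[0]
                    + (int64_t)c->r[i][1] * v[1]
                    + (int64_t)c->r[i][2] * v[2];
        mac[i] = c->tr[i] + sum / 4096;          /* drop the 12 fraction bits */
    }
    int64_t ir1 = clamp64(mac[0], -32768, 32767); /* limiter B1 */
    int64_t ir2 = clamp64(mac[1], -32768, 32767); /* limiter B2 */
    int64_t sz  = clamp64(mac[2], 0, 65535);      /* limiter D  */

    /* H/SZ as a 16.16 quotient; saturated when SZ <= H/2 (assumption). */
    int64_t q = (2 * sz <= c->h) ? 0x1ffff : (((int64_t)c->h << 16) / sz);

    ScreenPoint p;
    p.sx = (int32_t)clamp64((c->ofx + ir1 * q) / 65536, -1024, 1023); /* limiter G1 */
    p.sy = (int32_t)clamp64((c->ofy + ir2 * q) / 65536, -1024, 1023); /* limiter G2 */
    p.sz = (uint16_t)sz;
    return p;
}
```

With an identity rotation, a translation of (0,0,1000), H = 500 and offsets of (160,120), the vertex (100,50,0) lands at screen position (210,145).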











RTPT    cop2 0x0280030    Perspective transform on 3 points

Fields: None

In:   V0        Vector to transform.                      [1,15,0]
      V1                                                  [1,15,0]
      V2                                                  [1,15,0]
      R         Rotation matrix                           [1,3,12]
      TR        Translation vector                        [1,31,0]
      H         View plane distance                       [0,16,0]
      DQA       Depth cue interpolation values.           [1,7,8]
      DQB                                                 [1,7,8]
      OFX       Screen offset values.                     [1,15,16]
      OFY                                                 [1,15,16]

Out:  SXY fifo  Screen XY coordinates. (short)            [1,15,0]
      SZ fifo   Screen Z coordinate. (short)              [0,16,0]
      IR0       Interpolation value for depth cueing.     [1,3,12]
      IR1       Screen X (short)                          [1,15,0]
      IR2       Screen Y (short)                          [1,15,0]
      IR3       Screen Z (short)                          [1,15,0]
      MAC1      Screen X (long)                           [1,31,0]
      MAC2      Screen Y (long)                           [1,31,0]
      MAC3      Screen Z (long)                           [1,31,0]

Calculation: Same as RTPS, but repeats for V1 and V2.





MVMVA    8    cop2 0x0400012    Multiply vector by matrix and vector addition.

Fields: sf, mx, v, cv, lm

In:   V0/V1/V2/IR        Vector v0, v1, v2 or [IR1,IR2,IR3]
      R/LLM/LCM          Rotation, light or color matrix.         [1,3,12]
      TR/BK              Translation or background color vector.

Out:  [IR1,IR2,IR3]      Short vector
      [MAC1,MAC2,MAC3]   Long vector

Calculation:

MX = matrix specified by mx
V  = vector specified by v
CV = vector specified by cv

MAC1=A1[CV1 + MX11*V1 + MX12*V2 + MX13*V3]
MAC2=A2[CV2 + MX21*V1 + MX22*V2 + MX23*V3]
MAC3=A3[CV3 + MX31*V1 + MX32*V2 + MX33*V3]
IR1=Lm_B1[MAC1]
IR2=Lm_B2[MAC2]
IR3=Lm_B3[MAC3]





Notes: 
The cv field allows selection of the far color vector, but this vector 
is not added correctly by the GTE. 

















DPCL    8    cop2 0x0680029    Depth Cue Color light

Fields:

In:   RGB                Primary color.         R,G,B,CODE     [0,8,0]
      IR0                Interpolation value.                  [1,3,12]
      [IR1,IR2,IR3]      Local color vector.                   [1,3,12]
      CODE               Code value from RGB.   CODE           [0,8,0]
      FC                 Far color.                            [1,27,4]

Out:  RGBn               RGB fifo               Rn,Gn,Bn,CDn   [0,8,0]
      [IR1,IR2,IR3]      Color vector                          [1,11,4]
      [MAC1,MAC2,MAC3]   Color vector                          [1,27,4]

Calculation:

[1,27,4] MAC1=A1[R*IR1 + IR0*(Lm_B1[RFC - R*IR1])]   [1,27,16]
[1,27,4] MAC2=A2[G*IR2 + IR0*(Lm_B1[GFC - G*IR2])]   [1,27,16]
[1,27,4] MAC3=A3[B*IR3 + IR0*(Lm_B1[BFC - B*IR3])]   [1,27,16]
[1,11,4] IR1=Lm_B1[MAC1]                             [1,27,4]
[1,11,4] IR2=Lm_B2[MAC2]                             [1,27,4]
[1,11,4] IR3=Lm_B3[MAC3]                             [1,27,4]
[0,8,0]  Cd0<-Cd1<-Cd2<- CODE
[0,8,0]  R0<-R1<-R2<- Lm_C1[MAC1]                    [1,27,4]
[0,8,0]  G0<-G1<-G2<- Lm_C2[MAC2]                    [1,27,4]
[0,8,0]  B0<-B1<-B2<- Lm_C3[MAC3]                    [1,27,4]











DPCS    cop2 0x0780010    Depth Cueing

Fields:

In:   IR0                Interpolation value                   [1,3,12]
      RGB                Color                  R,G,B,CODE     [0,8,0]
      FC                 Far color              RFC,GFC,BFC    [1,27,4]

Out:  RGBn               RGB fifo               Rn,Gn,Bn,CDn   [0,8,0]
      [IR1,IR2,IR3]      Color vector                          [1,11,4]
      [MAC1,MAC2,MAC3]   Color vector                          [1,27,4]

Calculations:
[1,27,4] MAC1=A1[R + IR0*(Lm_B1[RFC - R])]   [1,27,16] [lm=0]
[1,27,4] MAC2=A2[G + IR0*(Lm_B1[GFC - G])]   [1,27,16] [lm=0]
[1,27,4] MAC3=A3[B + IR0*(Lm_B1[BFC - B])]   [1,27,16] [lm=0]
[1,11,4] IR1=Lm_B1[MAC1]                     [1,27,4]  [lm=0]
[1,11,4] IR2=Lm_B2[MAC2]                     [1,27,4]  [lm=0]
[1,11,4] IR3=Lm_B3[MAC3]                     [1,27,4]  [lm=0]
[0,8,0]  Cd0<-Cd1<-Cd2<- CODE
[0,8,0]  R0<-R1<-R2<- Lm_C1[MAC1]            [1,27,4]
[0,8,0]  G0<-G1<-G2<- Lm_C2[MAC2]            [1,27,4]
[0,8,0]  B0<-B1<-B2<- Lm_C3[MAC3]            [1,27,4]
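
The idea behind the depth cue formula is a linear blend between the original color and the far color, controlled by IR0. The sketch below shows that blend for a single 8 bit component; it ignores the intermediate register formats and limiter flags, and the function name is hypothetical.

```c
#include <stdint.h>

/* Blend component c towards far color component fc.
   ir0 is the interpolation value in 1,3,12 format:
   0 leaves c unchanged, 4096 (1.0) gives fc. */
static uint8_t depth_cue(uint8_t c, uint8_t fc, int32_t ir0)
{
    int32_t out = c + (ir0 * ((int32_t)fc - c)) / 4096;
    if (out < 0)   out = 0;      /* limit to the 8 bit range, */
    if (out > 255) out = 255;    /* like limiter C does       */
    return (uint8_t)out;
}
```

An object halfway into the fog (IR0 = 2048) therefore gets a color halfway between its own color and the far color.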














DPCT

Fields:

In:   IR0                Interpolation value                    [1,3,12]
      RGB0,RGB1,RGB2     Colors in RGB fifo.    Rn,Gn,Bn,CDn    [0,8,0]
      FC                 Far color              RFC,GFC,BFC     [1,27,4]

Out:  RGBn               RGB fifo               Rn,Gn,Bn,CDn    [0,8,0]
      [IR1,IR2,IR3]      Color vector                           [1,11,4]
      [MAC1,MAC2,MAC3]   Color vector                           [1,27,4]

Calculations:

[1,27,4] MAC1=A1[R0 + IR0*(Lm_B1[RFC - R0])]   [1,27,16] [lm=0]
[1,27,4] MAC2=A2[G0 + IR0*(Lm_B1[GFC - G0])]   [1,27,16] [lm=0]
[1,27,4] MAC3=A3[B0 + IR0*(Lm_B1[BFC - B0])]   [1,27,16] [lm=0]
[1,11,4] IR1=Lm_B1[MAC1]                       [1,27,4]  [lm=0]
[1,11,4] IR2=Lm_B2[MAC2]                       [1,27,4]  [lm=0]
[1,11,4] IR3=Lm_B3[MAC3]                       [1,27,4]  [lm=0]
[0,8,0]  Cd0<-Cd1<-Cd2<- CODE
[0,8,0]  R0<-R1<-R2<- Lm_C1[MAC1]              [1,27,4]
[0,8,0]  G0<-G1<-G2<- Lm_C2[MAC2]              [1,27,4]
[0,8,0]  B0<-B1<-B2<- Lm_C3[MAC3]              [1,27,4]

Performs this calculation 3 times, so all three RGB values have been
replaced by the depth cued RGB values.








INTPL

Fields:

In:   [IR1,IR2,IR3]      Vector                                [1,3,12]
      IR0                Interpolation value                   [1,3,12]
      CODE               Code value from RGB.   CODE           [0,8,0]
      FC                 Far color              RFC,GFC,BFC    [1,27,4]

Out:  RGBn               RGB fifo               Rn,Gn,Bn,CDn   [0,8,0]
      [IR1,IR2,IR3]      Color vector                          [1,11,4]
      [MAC1,MAC2,MAC3]   Color vector                          [1,27,4]

Calculations:

MAC1=A1[IR1 + IR0*(Lm_B1[RFC - IR1])]   [1,27,16]
MAC2=A2[IR2 + IR0*(Lm_B1[GFC - IR2])]   [1,27,16]
MAC3=A3[IR3 + IR0*(Lm_B1[BFC - IR3])]   [1,27,16]
IR1=Lm_B1[MAC1]                         [1,27,4]
IR2=Lm_B2[MAC2]                         [1,27,4]
IR3=Lm_B3[MAC3]                         [1,27,4]
Cd0<-Cd1<-Cd2<- CODE
R0<-R1<-R2<- Lm_C1[MAC1]                [1,27,4]
G0<-G1<-G2<- Lm_C2[MAC2]                [1,27,4]
B0<-B1<-B2<- Lm_C3[MAC3]                [1,27,4]



SQR    cop2 0x0A00428

Fields: sf

In:   [IR1,IR2,IR3]      vector      [1,15,0][1,3,12]
Out:  [IR1,IR2,IR3]      vector^2    [1,15,0][1,3,12]
      [MAC1,MAC2,MAC3]   vector^2    [1,31,0][1,19,12]

Calculation: (left format sf=0, right format sf=1)

[1,31,0][1,19,12] MAC1=A1[IR1*IR1]    [1,43,0][1,31,12]
[1,31,0][1,19,12] MAC2=A2[IR2*IR2]    [1,43,0][1,31,12]
[1,31,0][1,19,12] MAC3=A3[IR3*IR3]    [1,43,0][1,31,12]
[1,15,0][1,3,12]  IR1=Lm_B1[MAC1]     [1,31,0][1,19,12] [lm=1]
[1,15,0][1,3,12]  IR2=Lm_B2[MAC2]     [1,31,0][1,19,12] [lm=1]
[1,15,0][1,3,12]  IR3=Lm_B3[MAC3]     [1,31,0][1,19,12] [lm=1]



NCS    cop2 0x0C8041E

Fields:

In:   V0                 Normal vector
      BK                 Background color       RBK,GBK,BBK
      CODE               Code value from RGB.   CODE
      LCM                Color matrix
      LLM                Light matrix

Out:  RGBn               RGB fifo.              Rn,Gn,Bn,CDn
      [IR1,IR2,IR3]      Color vector
      [MAC1,MAC2,MAC3]   Color vector

Calculation:

MAC1=A1[L11*VX0 + L12*VY0 + L13*VZ0]            [1,19,24]
MAC2=A2[L21*VX0 + L22*VY0 + L23*VZ0]            [1,19,24]
MAC3=A3[L31*VX0 + L32*VY0 + L33*VZ0]            [1,19,24]
IR1= Lm_B1[MAC1]                                [1,19,12] [lm=1]
IR2= Lm_B2[MAC2]                                [1,19,12] [lm=1]
IR3= Lm_B3[MAC3]                                [1,19,12] [lm=1]
MAC1=A1[RBK + LR1*IR1 + LR2*IR2 + LR3*IR3]      [1,19,24]
MAC2=A2[GBK + LG1*IR1 + LG2*IR2 + LG3*IR3]      [1,19,24]
MAC3=A3[BBK + LB1*IR1 + LB2*IR2 + LB3*IR3]      [1,19,24]
IR1= Lm_B1[MAC1]                                [1,19,12] [lm=1]
IR2= Lm_B2[MAC2]                                [1,19,12] [lm=1]
IR3= Lm_B3[MAC3]                                [1,19,12] [lm=1]
Cd0<-Cd1<-Cd2<- CODE
R0<-R1<-R2<- Lm_C1[MAC1]                        [1,27,4]
G0<-G1<-G2<- Lm_C2[MAC2]                        [1,27,4]
B0<-B1<-B2<- Lm_C3[MAC3]                        [1,27,4]
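
The two matrix stages of the normal color calculation can be sketched as follows. This is a simplified model with hypothetical names: flags are ignored, products are rescaled with a plain division, and the final conversion from a 12 bit fraction to an 8 bit color by dividing by 16 is an assumption about how the color limiter is applied.

```c
#include <stdint.h>

typedef struct {
    int16_t llm[3][3];   /* light matrix, 1,3,12        */
    int16_t lcm[3][3];   /* color matrix, 1,3,12        */
    int32_t bk[3];       /* background color, 1,19,12   */
} LightCtx;

typedef struct { uint8_t r, g, b; } Rgb;

static int32_t clampi(int64_t v, int32_t lo, int32_t hi)
{
    return (int32_t)(v < lo ? lo : (v > hi ? hi : v));
}

/* One row of a matrix times a vector, rescaled to 12 fraction bits. */
static int64_t row_dot(const int16_t row[3], const int32_t v[3])
{
    int64_t sum = (int64_t)row[0] * v[0] + (int64_t)row[1] * v[1] + (int64_t)row[2] * v[2];
    return sum / 4096;
}

static Rgb ncs_sketch(const LightCtx *c, const int16_t normal[3])
{
    int32_t n[3] = { normal[0], normal[1], normal[2] };
    int32_t ir[3], col[3];

    /* Stage 1: light matrix * normal, negative results limited to 0 (lm=1). */
    for (int i = 0; i < 3; i++)
        ir[i] = clampi(row_dot(c->llm[i], n), 0, 32767);

    /* Stage 2: background color + color matrix * light intensities. */
    for (int i = 0; i < 3; i++)
        col[i] = clampi(c->bk[i] + row_dot(c->lcm[i], ir), 0, 32767);

    /* Assumed conversion to 8 bit color: 1.0 (4096) saturates at 255. */
    Rgb out = {
        (uint8_t)clampi(col[0] / 16, 0, 255),
        (uint8_t)clampi(col[1] / 16, 0, 255),
        (uint8_t)clampi(col[2] / 16, 0, 255)
    };
    return out;
}
```

A surface facing the light gets the full light color, while a surface facing away only receives the background color, because the negative intensity is limited to zero.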


NCT    cop2 0x0D80420    Normal color v0, v1, v2

Fields:

In:   V0,V1,V2           Normal vector
      BK                 Background color       RBK,GBK,BBK
      CODE               Code value from RGB.   CODE
      LCM                Color matrix
      LLM                Light matrix

Out:  RGBn               RGB fifo.              Rn,Gn,Bn,CDn
      [IR1,IR2,IR3]      Color vector
      [MAC1,MAC2,MAC3]   Color vector

Calculation: Same as NCS, but repeated for V1 and V2.


NCDS    cop2 0x0E80413    Normal color depth cue v0

Fields:

In:   V0                 Normal vector
      BK                 Background color       RBK,GBK,BBK
      RGB                Primary color          R,G,B,CODE
      LLM                Light matrix
      LCM                Color matrix
      IR0                Interpolation value

Out:  RGBn               RGB fifo.              Rn,Gn,Bn,CDn
      [IR1,IR2,IR3]      Color vector
      [MAC1,MAC2,MAC3]   Color vector

Calculation:

[1,19,12] MAC1=A1[L11*VX0 + L12*VY0 + L13*VZ0]           [1,19,24]
[1,19,12] MAC2=A2[L21*VX0 + L22*VY0 + L23*VZ0]           [1,19,24]
[1,19,12] MAC3=A3[L31*VX0 + L32*VY0 + L33*VZ0]           [1,19,24]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,19,12]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,19,12]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,19,12]
[1,19,12] MAC1=A1[RBK + LR1*IR1 + LR2*IR2 + LR3*IR3]     [1,19,24]
[1,19,12] MAC2=A2[GBK + LG1*IR1 + LG2*IR2 + LG3*IR3]     [1,19,24]
[1,19,12] MAC3=A3[BBK + LB1*IR1 + LB2*IR2 + LB3*IR3]     [1,19,24]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,19,12]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,19,12]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,19,12]
[1,27,4]  MAC1=A1[R*IR1 + IR0*(Lm_B1[RFC-R*IR1])]        [1,27,16]
[1,27,4]  MAC2=A2[G*IR2 + IR0*(Lm_B2[GFC-G*IR2])]        [1,27,16]
[1,27,4]  MAC3=A3[B*IR3 + IR0*(Lm_B3[BFC-B*IR3])]        [1,27,16]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,27,4]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,27,4]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,27,4]
          Cd0<-Cd1<-Cd2<- CODE
          R0<-R1<-R2<- Lm_C1[MAC1]                       [1,27,4]
          G0<-G1<-G2<- Lm_C2[MAC2]                       [1,27,4]
          B0<-B1<-B2<- Lm_C3[MAC3]                       [1,27,4]


NCDT    cop2 0x0F80416    Normal color depth cue v0, v1, v2

Fields:

In:   V0                 Normal vector
      V1                 Normal vector
      V2                 Normal vector
      BK                 Background color       RBK,GBK,BBK
      FC                 Far color              RFC,GFC,BFC
      RGB                Primary color          R,G,B,CODE
      LLM                Light matrix
      LCM                Color matrix
      IR0                Interpolation value

Out:  RGBn               RGB fifo.              Rn,Gn,Bn,CDn
      [IR1,IR2,IR3]      Color vector
      [MAC1,MAC2,MAC3]   Color vector

Calculation:
Same as NCDS but repeats for v1 and v2.








NCCS    cop2 0x108041B    Normal color col. v0

Fields:

In:   V0                 Normal vector
      BK                 Background color       RBK,GBK,BBK
      RGB                Primary color          R,G,B,CODE
      LLM                Light matrix
      LCM                Color matrix

Out:  RGBn               RGB fifo.              Rn,Gn,Bn,CDn
      [IR1,IR2,IR3]      Color vector
      [MAC1,MAC2,MAC3]   Color vector

Calculation:

[1,19,12] MAC1=A1[L11*VX0 + L12*VY0 + L13*VZ0]           [1,19,24]
[1,19,12] MAC2=A2[L21*VX0 + L22*VY0 + L23*VZ0]           [1,19,24]
[1,19,12] MAC3=A3[L31*VX0 + L32*VY0 + L33*VZ0]           [1,19,24]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,19,12] [lm=1]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,19,12] [lm=1]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,19,12] [lm=1]
[1,19,12] MAC1=A1[RBK + LR1*IR1 + LR2*IR2 + LR3*IR3]     [1,19,24]
[1,19,12] MAC2=A2[GBK + LG1*IR1 + LG2*IR2 + LG3*IR3]     [1,19,24]
[1,19,12] MAC3=A3[BBK + LB1*IR1 + LB2*IR2 + LB3*IR3]     [1,19,24]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,19,12] [lm=1]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,19,12] [lm=1]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,19,12] [lm=1]
[1,27,4]  MAC1=A1[R*IR1]                                 [1,27,16]
[1,27,4]  MAC2=A2[G*IR2]                                 [1,27,16]
[1,27,4]  MAC3=A3[B*IR3]                                 [1,27,16]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,27,4]  [lm=1]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,27,4]  [lm=1]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,27,4]  [lm=1]
[0,8,0]   Cd0<-Cd1<-Cd2<- CODE
[0,8,0]   R0<-R1<-R2<- Lm_C1[MAC1]                       [1,27,4]
[0,8,0]   G0<-G1<-G2<- Lm_C2[MAC2]                       [1,27,4]
[0,8,0]   B0<-B1<-B2<- Lm_C3[MAC3]                       [1,27,4]

NCCT

Fields:

In:   V0                 Normal vector 1                       [1,3,12]
      V1                 Normal vector 2                       [1,3,12]
      V2                 Normal vector 3                       [1,3,12]
      BK                 Background color       RBK,GBK,BBK    [1,19,12]
      RGB                Primary color          R,G,B,CODE     [0,8,0]
      LLM                Light matrix                          [1,3,12]
      LCM                Color matrix                          [1,3,12]

Out:  RGBn               RGB fifo.              Rn,Gn,Bn,CDn   [0,8,0]
      [IR1,IR2,IR3]      Color vector                          [1,11,4]
      [MAC1,MAC2,MAC3]   Color vector                          [1,27,4]

Calculation:
Same as NCCS but repeats for v1 and v2.


CDP    cop2 0x1280414    Color Depth Cue

Fields:

In:   [IR1,IR2,IR3]      Vector                                [1,3,12]
      RGB                Primary color          R,G,B,CODE     [0,8,0]
      IR0                Interpolation value                   [1,3,12]
      BK                 Background color       RBK,GBK,BBK    [1,19,12]
      LCM                Color matrix                          [1,3,12]
      FC                 Far color              RFC,GFC,BFC    [1,27,4]

Out:  RGBn               RGB fifo               Rn,Gn,Bn,CDn   [0,8,0]
      [IR1,IR2,IR3]      Color vector                          [1,11,4]
      [MAC1,MAC2,MAC3]   Color vector                          [1,27,4]

Calculation:

[1,19,12] MAC1=A1[RBK + LR1*IR1 + LR2*IR2 + LR3*IR3]     [1,19,24]
[1,19,12] MAC2=A2[GBK + LG1*IR1 + LG2*IR2 + LG3*IR3]     [1,19,24]
[1,19,12] MAC3=A3[BBK + LB1*IR1 + LB2*IR2 + LB3*IR3]     [1,19,24]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,19,12] [lm=1]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,19,12] [lm=1]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,19,12] [lm=1]
[1,27,4]  MAC1=A1[R*IR1 + IR0*(Lm_B1[RFC-R*IR1])]        [1,27,16] [lm=1]
[1,27,4]  MAC2=A2[G*IR2 + IR0*(Lm_B2[GFC-G*IR2])]        [1,27,16] [lm=1]
[1,27,4]  MAC3=A3[B*IR3 + IR0*(Lm_B3[BFC-B*IR3])]        [1,27,16] [lm=1]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,27,4]  [lm=1]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,27,4]  [lm=1]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,27,4]  [lm=1]
[0,8,0]   Cd0<-Cd1<-Cd2<- CODE
[0,8,0]   R0<-R1<-R2<- Lm_C1[MAC1]                       [1,27,4]
[0,8,0]   G0<-G1<-G2<- Lm_C2[MAC2]                       [1,27,4]
[0,8,0]   B0<-B1<-B2<- Lm_C3[MAC3]                       [1,27,4]








CC    cop2 0x138041C

Fields:

In:   [IR1,IR2,IR3]      Vector                                [1,3,12]
      BK                 Background color       RBK,GBK,BBK    [1,19,12]
      RGB                Primary color          R,G,B,CODE     [0,8,0]
      LCM                Color matrix                          [1,3,12]

Out:  RGBn               RGB fifo.              Rn,Gn,Bn,CDn   [0,8,0]
      [IR1,IR2,IR3]      Color vector                          [1,11,4]
      [MAC1,MAC2,MAC3]   Color vector                          [1,27,4]

Calculations:

[1,19,12] MAC1=A1[RBK + LR1*IR1 + LR2*IR2 + LR3*IR3]     [1,19,24]
[1,19,12] MAC2=A2[GBK + LG1*IR1 + LG2*IR2 + LG3*IR3]     [1,19,24]
[1,19,12] MAC3=A3[BBK + LB1*IR1 + LB2*IR2 + LB3*IR3]     [1,19,24]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,19,12] [lm=1]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,19,12] [lm=1]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,19,12] [lm=1]
[1,27,4]  MAC1=A1[R*IR1]                                 [1,27,16]
[1,27,4]  MAC2=A2[G*IR2]                                 [1,27,16]
[1,27,4]  MAC3=A3[B*IR3]                                 [1,27,16]
[1,3,12]  IR1= Lm_B1[MAC1]                               [1,27,4]  [lm=1]
[1,3,12]  IR2= Lm_B2[MAC2]                               [1,27,4]  [lm=1]
[1,3,12]  IR3= Lm_B3[MAC3]                               [1,27,4]  [lm=1]
[0,8,0]   Cd0<-Cd1<-Cd2<- CODE
[0,8,0]   R0<-R1<-R2<- Lm_C1[MAC1]                       [1,27,4]
[0,8,0]   G0<-G1<-G2<- Lm_C2[MAC2]                       [1,27,4]
[0,8,0]   B0<-B1<-B2<- Lm_C3[MAC3]                       [1,27,4]

NCLIP    8    cop2 0x1400006    Normal clipping

Fields:

In:   SXY0,SXY1,SXY2   Screen coordinates                      [1,15,0]

Out:  MAC0             Outer product of SXY1 and SXY2 with     [1,31,0]
                       SXY0 as origin.

Calculation:

[1,31,0] MAC0 = F[SX0*SY1 + SX1*SY2 + SX2*SY0 - SX0*SY2 - SX1*SY0 - SX2*SY1]   [1,43,0]
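
A small C sketch of the normal clipping test. The signs used here are those of the standard 2D cross product of (SXY1 - SXY0) and (SXY2 - SXY0), written out term by term; the function names are hypothetical.

```c
#include <stdint.h>

/* Outer product of the edges SXY0->SXY1 and SXY0->SXY2. */
static int64_t nclip(int32_t sx0, int32_t sy0,
                     int32_t sx1, int32_t sy1,
                     int32_t sx2, int32_t sy2)
{
    return (int64_t)sx0 * sy1 + (int64_t)sx1 * sy2 + (int64_t)sx2 * sy0
         - (int64_t)sx0 * sy2 - (int64_t)sx1 * sy0 - (int64_t)sx2 * sy1;
}

/* A negative result means the backside of the plane is facing the viewer,
   assuming the vertices were stored clockwise. */
static int is_backface(int32_t sx0, int32_t sy0,
                       int32_t sx1, int32_t sy1,
                       int32_t sx2, int32_t sy2)
{
    return nclip(sx0, sy0, sx1, sy1, sx2, sy2) < 0;
}
```

Swapping any two vertices flips the sign of the result, which is why a consistent vertex order matters.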

AVSZ3

Fields:

In:   SZ1,SZ2,SZ3   Z-Values    [0,16,0]
      ZSF3          Divider     [1,3,12]

Out:  OTZ           Average.    [0,16,0]
      MAC0          Average.    [1,31,0]

Calculation:

[1,31,0] MAC0=F[ZSF3*SZ1 + ZSF3*SZ2 + ZSF3*SZ3]   [1,31,12]
[0,16,0] OTZ=Lm_D[MAC0]                           [1,31,0]

AVSZ4    6    cop2 0x168002E    Average of four Z values

Fields:

In:   SZ0,SZ1,SZ2,SZ3   Z-Values    [0,16,0]
      ZSF4              Divider     [1,3,12]

Out:  OTZ               Average.    [0,16,0]
      MAC0              Average.    [1,31,0]

Calculation:

[1,31,0] MAC0=F[ZSF4*SZ0 + ZSF4*SZ1 + ZSF4*SZ2 + ZSF4*SZ3]   [1,31,12]
[0,16,0] OTZ=Lm_D[MAC0]                                      [1,31,0]
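
Both averaging commands follow the same pattern, so one sketch covers them. The function name is hypothetical; the scale factor is in 1,3,12 format, so 4096/3 (about 1365) averages three values and 4096/4 (1024) averages four.

```c
#include <stdint.h>

/* Scaled sum of count Z values, limited to the 16 bit range of OTZ. */
static uint16_t avsz(int32_t zsf, const uint16_t *sz, int count)
{
    int64_t mac0 = 0;
    for (int i = 0; i < count; i++)
        mac0 += (int64_t)zsf * sz[i];
    int64_t otz = mac0 / 4096;          /* drop the 12 fraction bits */
    if (otz < 0)      otz = 0;
    if (otz > 0xffff) otz = 0xffff;
    return (uint16_t)otz;
}
```

Because 1365 is slightly less than 4096/3, the average of three equal values of 300 comes out as 299. A smaller scale factor gives the pre-divided result that is convenient for indexing the OT.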


OP    6    cop2 0x170000C    Outer Product

Fields: sf

In:   [R11R12,R22R23,R33]   vector 1
      [IR1,IR2,IR3]         vector 2

Out:  [IR1,IR2,IR3]         outer product
      [MAC1,MAC2,MAC3]      outer product

Calculation: (D1=R11R12,D2=R22R23,D3=R33)

MAC1=A1[D2*IR3 - D3*IR2]
MAC2=A2[D3*IR1 - D1*IR3]
MAC3=A3[D1*IR2 - D2*IR1]
IR1=Lm_B1[MAC1]
IR2=Lm_B2[MAC2]
IR3=Lm_B3[MAC3]
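
The outer product calculation maps directly onto a few lines of C. In this sketch (hypothetical name) the sf field is modelled as an optional rescale by 4096, which is an assumption based on the two vector formats given in the field descriptions; limiters and flags are left out.

```c
#include <stdint.h>

/* out = d x ir. With sf set the products are rescaled from 24 to 12
   fraction bits, so 1,3,12 inputs give a 1,19,12 result. */
static void gte_op(const int32_t d[3], const int32_t ir[3], int sf, int64_t out[3])
{
    out[0] = (int64_t)d[1] * ir[2] - (int64_t)d[2] * ir[1];
    out[1] = (int64_t)d[2] * ir[0] - (int64_t)d[0] * ir[2];
    out[2] = (int64_t)d[0] * ir[1] - (int64_t)d[1] * ir[0];
    if (sf) {
        for (int i = 0; i < 3; i++)
            out[i] /= 4096;
    }
}
```

As a check, the X unit vector crossed with the Y unit vector gives the Z unit vector.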


GPF    6    cop2 0x190003D    General purpose interpolation

Fields: sf

In:   IR0                scaling factor
      CODE               code field of RGB
      [IR1,IR2,IR3]      vector

Out:  [IR1,IR2,IR3]      vector
      [MAC1,MAC2,MAC3]   vector
      RGB2               RGB fifo.

Calculation:

MAC1=A1[IR0 * IR1]
MAC2=A2[IR0 * IR2]
MAC3=A3[IR0 * IR3]
IR1=Lm_B1[MAC1]
IR2=Lm_B2[MAC2]
IR3=Lm_B3[MAC3]
[0,8,0] Cd0<-Cd1<-Cd2<- CODE
[0,8,0] R0<-R1<-R2<- Lm_C1[MAC1]
[0,8,0] G0<-G1<-G2<- Lm_C2[MAC2]
[0,8,0] B0<-B1<-B2<- Lm_C3[MAC3]

GPL

Fields: sf

In:   IR0                scaling factor
      CODE               code field of RGB
      [IR1,IR2,IR3]      vector
      [MAC1,MAC2,MAC3]   vector

Out:  [IR1,IR2,IR3]      vector
      [MAC1,MAC2,MAC3]   vector
      RGB2               RGB fifo.

Calculation:

MAC1=A1[MAC1 + IR0 * IR1]
MAC2=A2[MAC2 + IR0 * IR2]
MAC3=A3[MAC3 + IR0 * IR3]
IR1=Lm_B1[MAC1]
IR2=Lm_B2[MAC2]
IR3=Lm_B3[MAC3]
[0,8,0] Cd0<-Cd1<-Cd2<- CODE
[0,8,0] R0<-R1<-R2<- Lm_C1[MAC1]
[0,8,0] G0<-G1<-G2<- Lm_C2[MAC2]
[0,8,0] B0<-B1<-B2<- Lm_C3[MAC3]





• Field descriptions.


sf   0   vector format (1,31,0)
     1   vector format (1,19,12)

mx   0   Multiply with rotation matrix
     1   Multiply with light matrix
     2   Multiply with color matrix
     3   Unknown

v    0   V0 source vector (short)
     1   V1 source vector (short)
     2   V2 source vector (short)
     3   IR source vector (long)

cv   0   Add translation vector
     1   Add back color vector
     2   Unknown
     3   Add no vector

lm   0   No negative limit.
     1   Limit negative results to 0.
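
The position of each field inside the 25 bit command word is not spelled out here, but it can be inferred from the opcodes in the list of common MVMVA instructions. The sketch below assumes sf in bit 19, mx in bits 18-17, v in bits 16-15, cv in bits 14-13 and lm in bit 10; this inferred layout and the function name are assumptions, checked only against that list.

```c
#include <stdint.h>

/* Base MVMVA command with all fields zero. */
#define MVMVA_BASE 0x0400012u

/* Build the cop2 command field for MVMVA from its option fields. */
static uint32_t mvmva_cmd(unsigned sf, unsigned mx, unsigned v, unsigned cv, unsigned lm)
{
    return MVMVA_BASE
         | ((sf & 1u) << 19)
         | ((mx & 3u) << 17)
         | ((v  & 3u) << 15)
         | ((cv & 3u) << 13)
         | ((lm & 1u) << 10);
}
```

For example, sf=1, mx=0, v=0, cv=3 gives 0x0486012, which is the listed opcode for v0 * rotmatrix.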


A list of common MVMVA instructions:

rtv0     -   cop2 0x0486012   v0 * rotmatrix
rtv1     -   cop2 0x048E012   v1 * rotmatrix
rtv2     -   cop2 0x0496012   v2 * rotmatrix
rtir12   -   cop2 0x049E012   ir * rotmatrix
rtir0    -   cop2 0x041E012   ir * rotmatrix
rtv0tr   -   cop2 0x0480012   v0 * rotmatrix + tr vector
rtv1tr   -   cop2 0x0488012   v1 * rotmatrix + tr vector
rtv2tr   -   cop2 0x0490012   v2 * rotmatrix + tr vector
rtirtr   -   cop2 0x0498012   ir * rotmatrix + tr vector
rtv0bk   -   cop2 0x0482012   v0 * rotmatrix + bk vector
rtv1bk   -   cop2 0x048A012   v1 * rotmatrix + bk vector
rtv2bk   -   cop2 0x0492012   v2 * rotmatrix + bk vector
rtirbk   -   cop2 0x049A012   ir * rotmatrix + bk vector
ll       -   cop2 0x04A6412   v0 * light matrix. Lower limit result to 0
llv0     -   cop2 0x04A6012   v0 * light matrix
llv1     -   cop2 0x04AE012   v1 * light matrix
llv2     -   cop2 0x04B6012   v2 * light matrix
llvir    -   cop2 0x04BE012   ir * light matrix
llv0tr   -   cop2 0x04A0012   v0 * light matrix + tr vector
llv1tr   -   cop2 0x04A8012   v1 * light matrix + tr vector
llv2tr   -   cop2 0x04B0012   v2 * light matrix + tr vector
llirtr   -   cop2 0x04B8012   ir * light matrix + tr vector
llv0bk   -   cop2 0x04A2012   v0 * light matrix + bk vector
llv1bk   -   cop2 0x04AA012   v1 * light matrix + bk vector
llv2bk   -   cop2 0x04B2012   v2 * light matrix + bk vector
llirbk   -   cop2 0x04BA012   ir * light matrix + bk vector
lc       -   cop2 0x04DA412   v0 * color matrix. Lower limit clamped to 0
lcv0     -   cop2 0x04C6012   v0 * color matrix
lcv1     -   cop2 0x04CE012   v1 * color matrix
lcv2     -   cop2 0x04D6012   v2 * color matrix
lcvir    -   cop2 0x04DE012   ir * color matrix
lcv0tr   -   cop2 0x04C0012   v0 * color matrix + tr vector
lcv1tr   -   cop2 0x04C8012   v1 * color matrix + tr vector
lcv2tr   -   cop2 0x04D0012   v2 * color matrix + tr vector
lcirtr   -   cop2 0x04D8012   ir * color matrix + tr vector
lcv0bk   -   cop2 0x04C2012   v0 * color matrix + bk vector
lcv1bk   -   cop2 0x04CA012   v1 * color matrix + bk vector
lcv2bk   -   cop2 0x04D2012   v2 * color matrix + bk vector
lcirbk   -   cop2 0x04DA012   ir * color matrix + bk vector
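The MVMVA command words in this list appear to be built from the sf, mx, v, cv and lm fields. The following is a minimal sketch in Python, not an official encoding: the bit positions (sf at bit 19, mx at bits 17-18, v at bits 15-16, cv at bits 13-14, lm at bit 10) are an assumption inferred by comparing the command words in the list, and the function name is made up for illustration.

```python
# Bits shared by every MVMVA command word in the list (assumed base value).
MVMVA_BASE = 0x0400012


def mvmva_opcode(mx, v, cv, sf=1, lm=0):
    """Assemble a cop2 MVMVA command word from its field values.

    Bit positions are an assumption inferred from the listed command words.
    """
    for name, value, limit in (("mx", mx, 3), ("v", v, 3), ("cv", cv, 3),
                               ("sf", sf, 1), ("lm", lm, 1)):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range: {value}")
    return (MVMVA_BASE | (sf << 19) | (mx << 17) | (v << 15)
            | (cv << 13) | (lm << 10))


# v0 * rotmatrix, no vector added (rtv0 in the list)
rtv0 = mvmva_opcode(mx=0, v=0, cv=3)
```

Comparing a few results against the list is a quick way to check the assumed bit positions before relying on them.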


Other instructions:

sqr12   -   cop2 0x0A80428   square of ir                    1,19,12
sqr0    -   cop2 0x0A00428   square of ir                    1,31, 0
op12    -   cop2 0x178000C   outer product                   1,19,12
op0     -   cop2 0x170000C   outer product                   1,31, 0
gpf12   -   cop2 0x198003D   general purpose interpolation   1,19,12
gpf0    -   cop2 0x190003D   general purpose interpolation   1,31, 0
gpl12   -   cop2 0x1A8003E   general purpose interpolation   1,19,12
gpl0    -   cop2 0x1A0003E   general purpose interpolation   1,31, 0





The Motion Decoder (MDEC) 


The Motion Decoder (MDEC) is a special controller chip that takes compressed JPEG-like images and
decompresses them into 24-bit bitmapped images for display by the GPU. The MDEC can only decompress one 16x16
pixel 24-bit image at a time; these images are called "Macroblocks". A Macroblock is an encoded block that uses
the YUV (YCbCr) color scheme with Discrete Cosine Transformation (DCT) and Run Length Encoding (RLE) applied.
The MDEC also performs 24 to 16 bit color conversion to prepare the data for whatever color depth the GPU is in.
Because the decompression is extremely fast, the decompressed RGB bitmaps can be combined to form larger
pictures and then, if displayed in sequential order, produce movies. The maximum speed is about 9,000
macroblocks per second, so a 320x240 movie can be played at about 30 frames per second.
MDEC data can only be sent/received via DMA channels 0 and 1. DMA channel 0 is for compressed data going in
and channel 1 is for retrieval of the uncompressed macroblocks. The MDEC is controlled via the MDEC control
register at location $1f80_1820. The current status of the MDEC can be checked using the MDEC status register at
$1f80_1824. The following is a layout of the registers.
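The frame rate quoted for a 320x240 movie follows from the macroblock count per frame. A small sketch of that arithmetic in Python; the function names are illustrative and the 9,000 figure is the approximate maximum given in the text, not a measured value.

```python
MACROBLOCK_SIZE = 16            # pixels per side, from the text
MACROBLOCKS_PER_SECOND = 9000   # approximate maximum quoted in the text


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    across = -(-width // MACROBLOCK_SIZE)   # ceiling division
    down = -(-height // MACROBLOCK_SIZE)
    return across * down


def max_frame_rate(width, height):
    """Upper bound on frames per second at the quoted decode speed."""
    return MACROBLOCKS_PER_SECOND / macroblocks_per_frame(width, height)
```

A 320x240 frame is 20 by 15 macroblocks, which is where the figure of about 30 frames per second comes from.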


$1f80_1820 (mdec0)
write:

 31   28 27 26 25 24                          0

Note: The first word of every data segment in a str-file is a control word written to this register.

u        Unknown
RGB24    Should be set to 0 for 24-bit color and to 1 for 16-bit.
STP      In 16-bit mode, toggles whether to set bit 15 of the decompressed data (semi-transparency).


$1f80_1824 (mdec1)
read:

 31 30 29 28 27 26 25 24 23 22                0

u        Unknown
FIFO     First-In-First-Out buffer state
InSync   MDEC is busy decompressing data
OutSync  MDEC is transferring data to main memory
DREQ     Data Request
RGB24    0 for 24-bit color and 1 for 16-bit.
STP      In 16-bit mode, toggles whether to set bit 15 of the decompressed data (semi-transparency).

write:

 31 30                                        0

u        Unknown
reset    Reset MDEC


MDEC Data Format


The MDEC uses a 'lossy' picture format similar to that of the JPEG file format. A typical picture, before
being put into the MDEC via DMA, has the following format:

- The header is a 32-bit word.

   31         16 15          0
  |   0x3800    |    size     |

  0x3800   Data ID
  size     Size of data after the header

- The Macroblocks are further broken up as follows:

  Cb,Cr         The color difference blocks
  Y0,Y1,Y2,Y3   The luminance blocks

- Within each block the DCT information and the RLE compressed data are stored.

- DCT   DCT data; it holds the quantization factor and the Direct Current (DC) reference.

   15    10 9          0
  |    Q   |     DC     |

  Q    Quantization factor (6 bits, unsigned)
  DC   Direct Current reference (10 bits, signed)

- RLE   Run length data

   15    10 9          0
  | LENGTH |    DATA    |

  LENGTH   The number of zeros between data (6 bits, unsigned)
  DATA     The data (10 bits, signed)

- EOD   End Of Data (Footer)

   15                  0
  |       0xfe00        |

  Lets the MDEC know a block is done. The footer is also the same thing.
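To make the 16-bit word layout concrete, here is a hedged sketch in Python that splits one block into its DCT word, its run-length pairs and the end-of-data marker. It assumes the 6-bit field sits in bits 15-10 and the signed 10-bit field in bits 9-0; the function names are invented for this example and nothing here models the actual dequantization or inverse DCT.

```python
EOD = 0xFE00  # end-of-data marker from the text


def sign_extend_10(value):
    """Interpret the low 10 bits of value as a signed number."""
    value &= 0x3FF
    return value - 0x400 if value & 0x200 else value


def split_word(word):
    """Return (upper 6 bits unsigned, lower 10 bits signed)."""
    return (word >> 10) & 0x3F, sign_extend_10(word)


def decode_block(words):
    """Split one block into (quantization, dc, [(zero_run, data), ...]).

    The first word is taken as the DCT word; the following words are
    run-length pairs until the EOD marker is seen.
    """
    quantization, dc = split_word(words[0])
    runs = []
    for word in words[1:]:
        if word == EOD:
            break
        runs.append(split_word(word))
    return quantization, dc, runs
```

For example, the words 0x0801, 0x07ff, 0xfe00 split into a quantization factor of 2, a DC value of 1, and a single run of one zero followed by the value -1.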


SOUND 


SPU - Sound Processing Unit 


Introduction. 
The SPU is the unit responsible for all aural capabilities of the psx. It handles 24 voices, has a 512kb sound 
buffer. It also has ADSR envelope filters for each voice and lots of other features. 


The Sound Buffer 

The SPU has control over a 512kb sound buffer. Data is stored compressed into blocks of 16 bytes. Each
block contains 14 packed sample bytes and two header bytes, one for the packing and one for sample end and
looping information. One such block is decoded into 28 16-bit samples (56 bytes), since each packed byte
holds two 4-bit samples.

In the first 4 kb of the buffer the SPU stores the decoded data of CD audio after volume processing and the
sound data of voice 1 and voice 3 after envelope processing. The decoded data is stored as 16 bit signed values,
one sample per clock (44.1 kHz).

Following this first 4kb are 8 bytes reserved by the system. The memory beyond that is free to store
samples, up to the reverb work area if the effect processor is used. The size of this work area depends on which type
of effect is being processed. More on that later.


Voices 

The SPU has 24 hardware voices. These voices can be used to reproduce sample data or noise, or can be used
as frequency modulator on the next voice. Each voice has its own programmable ADSR envelope filter. The main
volume can be programmed independently for left and right output.
The ADSR envelope filter works as follows:

[Diagram: ADSR envelope, volume level (lvl) against time (t), showing the attack (Ar), decay (Dr),
sustain (Sr) and release (Rr) phases and the sustain level (Sl)]

Ar   Attack rate, which specifies the speed at which the volume increases from zero to its maximum value, as
     soon as the note on is given. The slope can be set to linear or exponential.
Dr   Decay rate specifies the speed at which the volume decreases to the sustain level. Decay is always
     decreasing exponentially.
Sl   Sustain level, base level from which sustain starts.
Sr   Sustain rate is the rate at which the volume of the sustained note increases or decreases. This can be either
     linear or exponential.
Rr   Release rate is the rate at which the volume of the note decreases as soon as the note off is given.
lvl  Volume level
t    Time

The overall volume can also be set to sweep up or down linearly or exponentially from its current value. This can be
done separately for left and right.


SPU Operation
The SPU occupies the area 0x1f80_1c00-0x1f80_1dff. All registers are 16 bits wide.

0x1f80_1c00-0x1f80_1d7f   Voice data area. For each voice there are 8 16-bit registers structured
                          like this:

0x1f80_1xx0-0x1f80_1xx2   Volume
(xx = 0xc0 + voice number)

0x1f80_1xx0   Volume Left
0x1f80_1xx2   Volume Right





Volume mode:

 15  14  13                 0
| 0 | S |        VV          |

VV   0x0000-0x3fff   Voice volume.
S    0               Phase Normal
     1               Inverted


Sweep mode:

 15  14   13   12   11     7 6        0
| 1 | Sl | Dr | Ph |        |   VV     |

VV   0x0000-0x007f   Voice volume.
Sl   0               Linear slope
     1               Exponential slope
Dr   0               Increase
     1               Decrease
Ph   0               Normal phase
     1               Inverted phase

In sweep mode, the current volume increases to its maximum value, or decreases to its minimum value, according to
the mode. Choose a phase equal to the phase of the current volume.


0x1f80_1xx4   Pitch

 15 14 13                   0
|     |         Pt           |

Pt   0x0000-0x3fff   Specifies pitch.
Any value can be set; the table shows only octaves:
   0x0200   -3 octaves
   0x0400   -2
   0x0800   -1
   0x1000   sample pitch
   0x2000   +1
   0x3fff   +2
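The octave table doubles the pitch value for each octave up, starting from 0x1000 as the unmodified sample pitch. A minimal sketch of that relation in Python; the helper names are hypothetical, and the playback rate calculation assumes that 0x1000 corresponds to the 44.1 kHz sample clock mentioned for the sound buffer.

```python
BASE_PITCH = 0x1000   # "sample pitch" entry of the table
MAX_PITCH = 0x3FFF    # largest value the Pt field can hold


def pitch_for_octave_shift(octaves):
    """Pitch register value for a shift of the given number of octaves."""
    value = int(round(BASE_PITCH * 2.0 ** octaves))
    return max(0, min(value, MAX_PITCH))


def playback_rate_hz(pitch, sample_rate=44100):
    """Effective playback rate, assuming 0x1000 plays at sample_rate."""
    return sample_rate * (pitch & MAX_PITCH) / BASE_PITCH
```

The +2 octave entry is clamped to 0x3fff because 0x4000 does not fit in the 14-bit field, which matches the last row of the table.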


0x1f80_1xx6   Start address of Sound

 15                         0
|           Addr             |

Addr   Start address of sound in sound buffer /8


0x1f80_1xx8   Attack/Decay/Sustain level

 15   14       8 7      4 3      0
| Am |    Ar    |   Dr   |   Sl   |

Am   0      Attack mode Linear
     1      Exponential
Ar   0-7f   Attack rate
Dr   0-f    Decay rate
Sl   0-f    Sustain level


0x1f80_1xxa   Sustain rate, Release Rate.

 15   14   13  12       6  5    4      0
| Sm | Sd | 0 |    Sr    | Rm |   Rr    |

Sm   0      Sustain rate mode linear
     1      exponential
Sd   0      Sustain rate mode increase
     1      decrease
Sr   0-7f   Sustain Rate
Rm   0      Linear decrease
     1      Exponential decrease
Rr   0-1f   Release Rate
Note: decay mode is always exponential decrease, and thus cannot be set.
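As an illustration of how the two envelope registers could be filled in, here is a hedged Python sketch that packs the fields into 16-bit values. The bit positions are an assumption derived from the field ranges given in the text (a 7-bit attack rate, 4-bit decay rate and sustain level, 7-bit sustain rate, 5-bit release rate); the function names are illustrative only.

```python
def _check(name, value, limit):
    if not 0 <= value <= limit:
        raise ValueError(f"{name} out of range: {value}")


def pack_ads(attack_exponential, attack_rate, decay_rate, sustain_level):
    """Value for the Attack/Decay/Sustain level register (assumed layout)."""
    _check("attack_rate", attack_rate, 0x7F)
    _check("decay_rate", decay_rate, 0xF)
    _check("sustain_level", sustain_level, 0xF)
    return ((int(bool(attack_exponential)) << 15) | (attack_rate << 8)
            | (decay_rate << 4) | sustain_level)


def pack_sr(sustain_exponential, sustain_decrease, sustain_rate,
            release_exponential, release_rate):
    """Value for the Sustain rate / Release rate register (assumed layout)."""
    _check("sustain_rate", sustain_rate, 0x7F)
    _check("release_rate", release_rate, 0x1F)
    return ((int(bool(sustain_exponential)) << 15)
            | (int(bool(sustain_decrease)) << 14)
            | (sustain_rate << 6)
            | (int(bool(release_exponential)) << 5)
            | release_rate)
```

Bit 13 of the second register is left at 0, as the layout shows it as a fixed zero.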


0x1f80_1xxc   Current ADSR volume

 15                         0
|          ADSRvol           |

ADSRvol   Returns the current envelope volume when read.


0x1f80_1xxe   Repeat address.

 15                         0
|            Ra              |

Ra   0x0000-0xffff   Address sample loops to at end.
Note: Setting this register only has effect after the voice has started (i.e. KeyON), else the loop address gets reset
by the sample.
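Since each voice owns eight consecutive 16-bit registers, the address of any per-voice register can be computed from the voice number. A short sketch in Python under that reading; the dictionary keys are names made up for this example, and the offsets are the ones listed for the voice data area.

```python
SPU_BASE = 0x1F801C00
VOICE_COUNT = 24
VOICE_STRIDE = 0x10  # 8 registers of 16 bits per voice

# Offsets within one voice's register block (names are illustrative).
VOICE_REGISTERS = {
    "volume_left": 0x0,
    "volume_right": 0x2,
    "pitch": 0x4,
    "start_address": 0x6,
    "ads": 0x8,
    "sr": 0xA,
    "adsr_volume": 0xC,
    "repeat_address": 0xE,
}


def voice_register(voice, name):
    """Absolute address of a per-voice SPU register."""
    if not 0 <= voice < VOICE_COUNT:
        raise ValueError(f"voice out of range: {voice}")
    return SPU_BASE + voice * VOICE_STRIDE + VOICE_REGISTERS[name]
```

The last register of voice 23 lands at 0x1f80_1d7e, just below the global registers that start at 0x1f80_1d80.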


SPU Global Registers

0x1f80_1d80   Main volume left
0x1f80_1d82   Main volume right

 15                         0
|           MVol             |

MVol   Main volume
Sets the main volume. These work the same as the channel volume registers; see those for details.


0x1f80_1d84   Reverberation depth left
0x1f80_1d86   Reverberation depth right

 15  14                     0
| P |          Rvd           |

Rvd   0x0000-0x7fff   Sets the wet volume for the effect.
P     0               Normal phase
      1               Inverted phase

The following registers have a common layout:

first register:
 bit      15   14   13   12   11   10   9    8    7    6    5    4    3    2    1    0
 channel  c15  c14  c13  c12  c11  c10  c9   c8   c7   c6   c5   c4   c3   c2   c1   c0

second register:
 bit      15-8   7    6    5    4    3    2    1    0
 channel  0      c23  c22  c21  c20  c19  c18  c17  c16

c0-c23   0   Mode for channel cxx off
         1   Mode for channel cxx on

0x1f80_1d88   Voice ON (0-15)
0x1f80_1d8a   Voice ON (16-23)
Sets the current voice to key on. (i.e. start ADS)


0x1f80_1d8c   Voice OFF (0-15)
0x1f80_1d8e   Voice OFF (16-23)
Sets the current voice to key off. (i.e. release)


0x1f80_1d90   Channel FM (pitch LFO) mode (0-15)
0x1f80_1d92   Channel FM (pitch LFO) mode (16-23)
Sets the channel frequency modulation. Uses the previous channel as modulator.


0x1f80_1d94   Channel Noise mode (0-15)
0x1f80_1d96   Channel Noise mode (16-23)
Sets the channel to noise.


0x1f80_1d98   Channel Reverb mode (0-15)
0x1f80_1d9a   Channel Reverb mode (16-23)
Sets reverb for the channel. As soon as the sample ends, the reverb for that channel is turned off.


0x1f80_1d9c   Channel ON/OFF (0-15)
0x1f80_1d9e   Channel ON/OFF (16-23)
Returns whether the channel is mute or not.
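Each of these register pairs splits a 24-bit channel mask across two 16-bit registers. A minimal Python sketch of that split; the function name is an assumption made for this example.

```python
def voice_mask_words(voices):
    """Return (low word for voices 0-15, high word for voices 16-23)."""
    mask = 0
    for voice in voices:
        if not 0 <= voice <= 23:
            raise ValueError(f"voice out of range: {voice}")
        mask |= 1 << voice
    return mask & 0xFFFF, (mask >> 16) & 0xFF


# Values to write to 0x1f80_1d88 and 0x1f80_1d8a to key on voices 0 and 16.
low, high = voice_mask_words([0, 16])
```

The same pair of values can be reused for the Voice OFF, FM, Noise and Reverb register pairs, since they share the layout.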


0x1f80_1da2   Reverb work area start

 15                         0
|          Revwa             |

Revwa   0x0000-0xffff   Reverb work area start in sound buffer /8


0x1f80_1da4   Sound buffer IRQ address.

 15                         0
|           IRQa             |

IRQa   0x0000-0xffff   IRQ address in sound buffer /8


0x1f80_1da6   Sound buffer address.

 15                         0
|           Sba              |

Sba   0x0000-0xffff   Address in sound buffer divided by eight. The next transfer goes to this address.


0x1f80_1da8   SPU data

 15                         0

Data forwarding register, for non-DMA transfer.


0x1f80_1daa   SPU control   sp0

 15   14   13      8  7    6     5   4   3    2    1    0
| En | Mu |  Noise  | Rv | Irq |  DMA  | Er | Cr | Ee | Ce |

En      0    SPU off
        1    SPU on
Mu      0    Mute SPU
        1    Unmute SPU
Noise        Noise clock frequency
Rv      0    Reverb Disabled
        1    Reverb Enabled
Irq     0    Irq disabled
        1    Irq enabled
DMA     00
        01   Non DMA write (transfer through data reg)
        10   DMA Write
        11   DMA Read
Er      0    Reverb for external off
        1    Reverb for external on
Cr      0    Reverb for CD off
        1    Reverb for CD on
Ee      0    External audio off
        1    External audio on
Ce      0    CD audio off
        1    CD audio on
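A control word for sp0 can be assembled from these fields. The following Python sketch assumes the bit positions En=15, Mu=14, Noise=13-8, Rv=7, Irq=6, DMA=5-4, Er=3, Cr=2, Ee=1, Ce=0; the reverb bit at position 7 is stated elsewhere in the text, the rest is inferred from the order of the fields. The function and parameter names are illustrative.

```python
def spu_control(enable=False, unmute=False, noise_clock=0, reverb=False,
                irq=False, dma_mode=0, ext_reverb=False, cd_reverb=False,
                ext_audio=False, cd_audio=False):
    """Assemble a value for the sp0 register (assumed bit layout)."""
    if not 0 <= noise_clock <= 0x3F:
        raise ValueError("noise_clock must fit in 6 bits")
    if not 0 <= dma_mode <= 3:
        raise ValueError("dma_mode must be 0-3")
    return ((int(enable) << 15) | (int(unmute) << 14) | (noise_clock << 8)
            | (int(reverb) << 7) | (int(irq) << 6) | (dma_mode << 4)
            | (int(ext_reverb) << 3) | (int(cd_reverb) << 2)
            | (int(ext_audio) << 1) | int(cd_audio))


# SPU on, unmuted, CD audio enabled
value = spu_control(enable=True, unmute=True, cd_audio=True)
```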


0x1f80_1dac   SPU status

 15                         0

In SPU init routines this register gets loaded with 0x4.


0x1f80_1dae   SPU status

 15     12  11   10   9             0
|         | Dh | Rd |                |

Dh   0   Decoding in first half of buffer
     1   Decoding in second half of buffer
Rd   0   SPU ready to transfer
     1   SPU not ready

Some of bits 9-0 are also ready/not ready states. More on that later. Functions that wait for the SPU to be ready wait
for bits 10-0 to become 0.


0x1f80_1db0   CD volume left
0x1f80_1db2   CD volume right

 15  14                     0
| P |         CDvol          |

CDvol   0x0000-0x7fff   Set volume of CD input.
P       0               Normal phase.
        1               Inverted phase.


0x1f80_1db4   Extern volume left
0x1f80_1db6   Extern volume right

 15  14                     0
| P |         Exvol          |

Exvol   0x0000-0x7fff   Set volume of External input.
P       0               Normal phase.
        1               Inverted phase.


0x1dc0-0x1dff

0x1f80_1dc0   0x1f80_1dc2   0x1f80_1dc4   0x1f80_1dc6
0x1f80_1dc8   0x1f80_1dca   0x1f80_1dcc   0x1f80_1dce
0x1f80_1dd0   0x1f80_1dd2   0x1f80_1dd4   0x1f80_1dd6
0x1f80_1dd8   0x1f80_1dda   0x1f80_1ddc   0x1f80_1dde
0x1f80_1de0   0x1f80_1de2   0x1f80_1de4   0x1f80_1de6
0x1f80_1de8   0x1f80_1dea   0x1f80_1dec   0x1f80_1dee
0x1f80_1df0   0x1f80_1df2   0x1f80_1df4   0x1f80_1df6
0x1f80_1df8   0x1f80_1dfa   0x1f80_1dfc   0x1f80_1dfe


Reverb 


Reverb configuration area 


Lowpass Filter Frequency. 7fff = max value= no filtering 
Effect volume 0 - Ox7fff, bit 15 = phase. 


Feedback 


Delaytime(see below) 
Delaytime(see below) 
Delaytime(see below) 


Delaytime(see below) 


Delaytime 
Delaytime 


The SPU is equipped with an effect processor for reverb, echo and delay types of effects. This effect
processor can do one effect at a time, and for each voice you can specify whether it should have the effect applied or
not.

The effect is set up by initializing the registers 0x1dc0 to 0x1dfe to the desired effect. I do not exactly know
how these work, but you can use the presets below.

The effect processor needs a bit of sound buffer memory to perform its calculations. The size of this
depends on the effect type. For the presets the sizes are:


Reverb off      0x00000      Hall         0x0ade0
Room            0x026c0      Space echo   0x0f6c0
Studio small    0x01f40      Echo         0x18040
Studio medium   0x04840      Delay        0x18040
Studio large    0x06fe0      Half echo    0x03c00


The location of the work area is set in register 0x1da2, and its value is the location in the
sound buffer divided by eight. Common values are as follows:

Reverb off      0xFFFE      Hall         0xEA44
Room            0xFB28      Space echo   0xE128
Studio small    0xFC18      Echo         0xCFF8
Studio medium   0xF6F8      Delay        0xCFF8
Studio large    0xF204      Half echo    0xF880
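The common work area start values are consistent with placing the work area at the very end of the 512 KB sound buffer. The sketch below, in Python, shows that relation; it is an observation about the listed numbers rather than a documented rule, the function name is made up, and "Reverb off" is left out because its listed value (0xFFFE) does not follow the same formula.

```python
SOUND_BUFFER_SIZE = 0x80000  # 512 KB

# Work area sizes of the presets, as listed in the text.
PRESET_SIZES = {
    "Room": 0x026C0,
    "Studio small": 0x01F40,
    "Studio medium": 0x04840,
    "Studio large": 0x06FE0,
    "Hall": 0x0ADE0,
    "Space echo": 0x0F6C0,
    "Echo": 0x18040,
    "Delay": 0x18040,
    "Half echo": 0x03C00,
}


def work_area_start(size):
    """Register value (address / 8) for a work area ending at buffer end."""
    return (SOUND_BUFFER_SIZE - size) // 8
```

Any sample data has to stay below the address that this value represents, otherwise the effect processor will overwrite it.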


For the delay and echo effects (not space echo or half echo) you can specify the delay time and feedback
(range 0-127). Calculations are shown below.


When you set up a new reverb effect, take the following steps:

-Turn off the reverb (bit 7 in sp0)
-Set Depth to 0
-First make delay & feedback calculations.
-Copy the preset to the effect registers
-Turn on the reverb
-Set Depth to desired value.

Also make sure the reverb work area is cleared, else you might get some unwanted noise.

To use the effect on a voice, simply turn on the corresponding bit in the channel reverb registers. Note that these get
turned off automatically when the sample for the channel ends.


Effect presets
Copy these in order to 0x1dc0-0x1dfe.

Reverb off:
0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000
0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000
0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000
0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000

Room:
0x007D, 0x005B, 0x6D80, 0x54B8, 0xBED0, 0x0000, 0x0000, 0xBA80
0x5800, 0x5300, 0x04D6, 0x0333, 0x03F0, 0x0227, 0x0374, 0x01EF
0x0334, 0x01B5, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000
0x0000, 0x0000, 0x01B4, 0x0136, 0x00B8, 0x005C, 0x8000, 0x8000

Studio Small:
0x0033, 0x0025, 0x70F0, 0x4FA8, 0xBCE0, 0x4410, 0xC0F0, 0x9C00
0x5280, 0x4EC0, 0x03E4, 0x031B, 0x03A4, 0x02AF, 0x0372, 0x0266
0x031C, 0x025D, 0x025C, 0x018E, 0x022F, 0x0135, 0x01D2, 0x00B7
0x018F, 0x00B5, 0x00B4, 0x0080, 0x004C, 0x0026, 0x8000, 0x8000

Studio Medium:
0x00B1, 0x007F, 0x70F0, 0x4FA8, 0xBCE0, 0x4510, 0xBEF0, 0xB4C0
0x5280, 0x4EC0, 0x0904, 0x076B, 0x0824, 0x065F, 0x07A2, 0x0616
0x076C, 0x05ED, 0x05EC, 0x042E, 0x050F, 0x0305, 0x0462, 0x02B7
0x042F, 0x0265, 0x0264, 0x01B2, 0x0100, 0x0080, 0x8000, 0x8000

Studio Large:
0x00E3, 0x00A9, 0x6F60, 0x4FA8, 0xBCE0, 0x4510, 0xBEF0, 0xA680
0x5680, 0x52C0, 0x0DFB, 0x0B58, 0x0D09, 0x0A3C, 0x0BD9, 0x0973
0x0B59, 0x08DA, 0x08D9, 0x05E9, 0x07EC, 0x04B0, 0x06EF, 0x03D2
0x05EA, 0x031D, 0x031C, 0x0238, 0x0154, 0x00AA, 0x8000, 0x8000

Hall:
0x01A5, 0x0139, 0x6000, 0x5000, 0x4C00, 0xB800, 0xBC00, 0xC000
0x6000, 0x5C00, 0x15BA, 0x11BB, 0x14C2, 0x10BD, 0x11BC, 0x0DC1
0x11C0, 0x0DC3, 0x0DC0, 0x09C1, 0x0BC4, 0x07C1, 0x0A00, 0x06CD
0x09C2, 0x05C1, 0x05C0, 0x041A, 0x0274, 0x013A, 0x8000, 0x8000

Space Echo:
0x033D, 0x0231, 0x7E00, 0x5000, 0xB400, 0xB000, 0x4C00, 0xB000
0x6000, 0x5400, 0x1ED6, 0x1A31, 0x1D14, 0x183B, 0x1BC2, 0x16B2
0x1A32, 0x15EF, 0x15EE, 0x1055, 0x1334, 0x0F2D, 0x11F6, 0x0C5D
0x1056, 0x0AE1, 0x0AE0, 0x07A2, 0x0464, 0x0232, 0x8000, 0x8000

Echo:
0x0001, 0x0001, 0x7FFF, 0x7FFF, 0x0000, 0x0000, 0x0000, 0x8100
0x0000, 0x0000, 0x1FFF, 0x0FFF, 0x1005, 0x0005, 0x0000, 0x0000
0x1005, 0x0005, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000
0x0000, 0x0000, 0x1004, 0x1002, 0x0004, 0x0002, 0x8000, 0x8000

Delay:
0x0001, 0x0001, 0x7FFF, 0x7FFF, 0x0000, 0x0000, 0x0000, 0x0000
0x0000, 0x0000, 0x1FFF, 0x0FFF, 0x1005, 0x0005, 0x0000, 0x0000
0x1005, 0x0005, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000
0x0000, 0x0000, 0x1004, 0x1002, 0x0004, 0x0002, 0x8000, 0x8000

Half Echo:
0x0017, 0x0013, 0x70F0, 0x4FA8, 0xBCE0, 0x4510, 0xBEF0, 0x8500
0x5F80, 0x54C0, 0x0371, 0x02AF, 0x02E5, 0x01DF, 0x02B0, 0x01D7
0x0358, 0x026A, 0x01D6, 0x011E, 0x012D, 0x00B1, 0x011F, 0x0059
0x01A0, 0x00E3, 0x0058, 0x0040, 0x0028, 0x0014, 0x8000, 0x8000


Delay time calculation:

Choose the delay time (dt) in the range 0-0x7f. rXXXX means register 0x1f80_XXXX.

r1dd4 = dt*64.5 - r1dc0
r1dd6 = dt*32.5 - r1dc2

r1dd8 = r1dda + dt*32.5
r1de0 = r1de2 + dt*32.5
r1df4 = r1df8 + dt*32.5
r1df6 = r1dfa + dt*32.5

The CD-ROM 


Overview 
The PSX uses a special two speed CD-ROM that can stream at 352K/sec. It uses the following registers to
control it:
CDREG0 = 0x1f80_1800
CDREG1 = 0x1f80_1801
CDREG2 = 0x1f80_1802
CDREG3 = 0x1f80_1803


REGISTER FORMAT
CDREG0   write:   0 to send a command
                  1 to get the result
         read:    I/O status
                  bit 0   0   REG1 command send
                          1   REG1 data read
                  bit 1   0   data transfer finished
                          1   data transfer ready/in progress
                  bit 7   1   command being processed.
CDREG1   write:   command
         read:    results
CDREG2   write:   send arguments
         write:   7 = flush arg buffer?
CDREG3   write:   7 = flush irq
         read:    hi nibble: unknown
                  low nibble: interrupt status
MODES FOR SETMODE

[Table: mode, bit, function. Surviving entries: a sector size mode (2328 byte) and Report (off / on)]

These modes can be set using the setmode command.

Status bits:

[Table: status bit names and meanings. Surviving entry: Standby, spindle motor rotating]

These are the bit values for the status byte received from CD commands.

Interrupt values:

[Table: interrupt values. Surviving entries: Data Ready, Command Complete, Acknowledge, End of Data Detected]





These are returned in the low nibble of CDREG3. First write a 1 to CDREG0 before reading CDREG3. When a
command is completed it returns 3. To acknowledge an irq value after you've handled it, write a 1 to CDREG0 then a
7 to both CDREG2 and CDREG3. Another interrupt may be queued, so check CDREG3 again to see whether it reads 0
or whether there's another interrupt to be handled.


Name        Number   Blocking   Parameters       Results
Sync        0x00                -                status
Nop         0x01                -                status
Setloc      0x02                min,sec,sector   status
Play        0x03     B          -                status
Forward     0x04     B          -                status
Backward    0x05     B          -                status
ReadN       0x06     B          -                status
Standby     0x07     B          -                status
Stop        0x08     B          -                status
Pause       0x09     B          -                status
Init        0x0A                -                status
Mute        0x0B                -                status
Demute      0x0C                -                status
Setfilter   0x0D                file,channel     status
Setmode     0x0E                mode             status
Getparam    0x0F                -                status,mode,file?,chan?,?,?
GetlocL     0x10                -                min,sec,sector,mode,file,channel
GetlocP     0x11                -                track,index,min,sec,frame,amin,asec,aframe
GetTN       0x13                -                status,first,total (BCD)
GetTD       0x14                track            status,min,sec (BCD)
SeekL       0x15     B          *                status
SeekP       0x16     B          *                status
Test        0x19     B          #                depends on parameter
ID          0x1A     B          -                success,flag1,flag2,00, 4 letters of ID (SCEx)
ReadS       0x1B     B          -                status
Reset       0x1C                -                status
ReadTOC     0x1E     B          -                status


* These commands' targets are set using Setloc.
# Command 19 is really a portal to another set of commands.

B means blocking. These commands return an immediate result saying the command was started, but you need to
wait for an IRQ in order to get real results.


Command
Number   Name        Description

0x00     Sync        Command does not succeed until all other commands complete. This can be used
                     for synchronization.

0x02     Setloc      This command, with its parameters, sets the target for commands with a * for their
                     parameter list.

0x03     Play        Plays audio sectors from the last point seeked. This is almost identical to
                     CdlReadS, believe it or not. The main difference is that this does not trigger a
                     completed read IRQ. CdlPlay may be used on data sectors. However, all sectors
                     from data tracks are treated as 00, so no sound is played. As CdlPlay is reading, the
                     audio data appears in the sector buffer, but is not reliable. Game Shark
                     "enhancement CDs" for the 2.x and 3.x versions used this to get around the PSX
                     copy protection.

0x04     Forward     Seek to next track?

0x05     Backward    Seek to beginning of current track, or previous track if early in current track (like a
                     CD player's back button).

0x06     ReadN       Read with retry. Each sector causes an IRQ (type 1) if ModeRept is on (I think).
                     ReadN and ReadS cause errors if you're trying to read a non-PSX CD or audio CD
                     without a mod chip.

0x07     Standby     CD-ROM aborts all reads and playing, but continues spinning. CD-ROM does not
                     attempt to keep its place.

0x08     Stop        Stops motor. Official way to restart is 0A, but almost any command will restart it.

0x09     Pause       Like Standby, except the point is to maintain the current location within reasonable
                     error.

0x0A     Init        Multiple effects at once. Setmode = 00, Standby, abort all commands.

0x0B     Mute        Turn off CDDA stream to SPU.

0x0C     Demute      Turn on CDDA streaming to SPU.

0x0D     Setfilter   Automatic ADPCM (CD-ROM XA) filter ignores sectors except those which have
                     the same channel and file (parameters) in their subheader area. This is the
                     mechanism used to select which of multiple songs in a single XA to play. Setfilter
                     does not affect actual reading (sector reads still occur for all sectors).

0x0E     Setmode     Sets the modes listed under MODES FOR SETMODE.

0x0F     Getparam    Returns status, mode, file, channel, ?, ?

0x10     GetlocL     Retrieves first 6 (8?) bytes of last read sector (header). This is used to know where
                     the sector came from, but is generally pointless in 2340 byte read mode. All results
                     are in BCD ($12 is considered track twelve, not eighteen). Command may execute
                     concurrently with a read or play (GetlocL returns results immediately).

0x11     GetlocP     Retrieves 8 of 12 bytes of sub-Q data for the last-read sector. Same purpose as
                     GetlocL, but more powerful, and works while playing audio. All results are in
                     BCD. See note.

0x13     GetTN       Get first track number and number of tracks in the TOC.

0x14     GetTD       Gets start of specified track (does it return sector??)

0x15     SeekL       Seek to the Setloc target (accurate to the sector).

0x16     SeekP       Seek to the Setloc target (accurate to the second).

0x19     Test        This function has many subcommands that are completely different. See ending
                     notes.

NOTES

- The sub-Q format is as follows:
  track:    track number ($AA for lead-out area)
  index:    index number (INDEX lines in CUE sheets)
  min:      minute number within track
  sec:      second number within track
  frame:    sector number within "sec" (0 to 74)
  amin:     minute number on entire disk
  asec:     second number on entire disk
  aframe:   sector number within "asec" (0 to 74)
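Because the GetlocL and GetlocP results are in BCD, each byte has to be converted before it can be used in arithmetic. A small Python sketch of the conversion; the function names are chosen for this example only.

```python
def bcd_to_int(value):
    """Convert one packed BCD byte (e.g. 0x12) to its decimal value (12)."""
    high, low = (value >> 4) & 0xF, value & 0xF
    if high > 9 or low > 9:
        raise ValueError(f"not a valid BCD byte: {value:#04x}")
    return high * 10 + low


def int_to_bcd(value):
    """Convert a number in the range 0-99 to one packed BCD byte."""
    if not 0 <= value <= 99:
        raise ValueError("value must be in range 0-99")
    return ((value // 10) << 4) | (value % 10)
```

The reverse conversion is what a Setloc parameter would need if its min, sec and sector arguments are also expected in BCD, which the text does not state explicitly.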


- Test subcommands

1A ID

Returns copy protection status. StatError for invalid data CD, StatStandby for valid PSX CD or audio CD. The
following bits I'm unsure about, but I think the 3rd byte has the $80 bit for "CD denied" and the $10 bit for "import".
$80 = copy, $90 = denied import, $10 = accepted import (Yaroze only). The 5th through 8th bytes are the SCEx ASCII
string from the CD.

1B ReadS

Read without automatic retry.

1C Reset

Same as opening and closing the drive door.

1E ReadTOC

Reread the Table of Contents without reset.


To send a command:

- First send any arguments by writing 0 to CDREG0, then all arguments sequentially to CDREG2.

- Then write 0 to CDREG0, and the command to CDREG1.

To wait for a command to complete:

- Wait until a CD-ROM irq occurs (bit 3 of the interrupt regs). The cause of the CD-ROM irq is in the low nibble of
CDREG3. This is usually 3 on a successful completion. Failure to complete the command will result in a 5. If you
don't wish to use irq's you can just check for the low nibble of CDREG3 to become something other than 0, but make
sure it doesn't get cleared in any irq setup by the BIOS or some such.


To get the results:


- Write a 1 to CDREG0, then read CDREG0. If bit 5 is set, read a return value from CDREG1, then read CDREG0
again. Repeat until bit 5 goes low.


To clear the irq:

- After command completion the irq cause should be cleared. Do this by writing a 1 to CDREG0, then 7 to CDREG2
and CDREG3. My guess is that the write to CDREG2 clears the arguments previously set from some buffer. Note
that irq's are queued, and if you clear the current one, another may come up directly.
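The send and clear sequences can be written down as lists of register writes. The Python sketch below only builds those lists so the order is easy to see; it does not talk to hardware, the helper names are made up, and skipping the first "write 0 to CDREG0" when there are no arguments is an assumption.

```python
CDREG0 = 0x1F801800
CDREG1 = 0x1F801801
CDREG2 = 0x1F801802
CDREG3 = 0x1F801803


def command_writes(command, args=()):
    """(address, value) writes that send a command with its arguments."""
    writes = []
    if args:
        writes.append((CDREG0, 0))
        writes.extend((CDREG2, arg) for arg in args)
    writes.append((CDREG0, 0))
    writes.append((CDREG1, command))
    return writes


def clear_irq_writes():
    """(address, value) writes that acknowledge the current irq cause."""
    return [(CDREG0, 1), (CDREG2, 7), (CDREG3, 7)]


# Setloc (0x02) with min, sec, sector arguments
setloc = command_writes(0x02, [0x00, 0x02, 0x16])
```

Keeping the sequences as data makes it simple to replay them against an emulator core or to log them when debugging.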


To init the CD:

-Flush all irq's
-CDREG0=0
-CDREG3=0
-Com_Delay=4901 ($1f801020)
-Send 2 NOP's
-Command $0a, no args.
-Demute


To set up the CD for audio playback:

CDREG0=2
CDREG2=$80
CDREG3=0
CDREG0=3
CDREG1=$80
CDREG2=0
CDREG3=$20

Also don't forget to init the SPU. (CDvol and CD enable especially)


You should not send some commands while the CD is seeking (i.e. status returns with bit 6 set). The thing is that the
status only gets updated after a new command. I haven't tested this for other commands, but for the play command
($03) you can just keep repeating the command and checking the status returned by that, for bit 6 to go low (and bit 7
to go high in this case). If you don't, and try to do a getloc directly after the play command reports it's done, the CD will
stop. (I guess the CD can't get its current location while it's seeking, so the logic stops the seek to get an exact fix, but
never restarts.)


19 subcommands.


For one reason or another, there is a counter that counts the number of SCEx strings received by the CD-ROM
controller.

Be aware that the results for these commands can exceed 8 bytes.

0x05   Reset SCEx counter. This also sets 1A's SCEx response to 00 00, but doesn't appear to force a
       protection failure.
0x20   Returns an ASCII string specifying where the CD-ROM firmware is intended to be used.
0x22   Returns a chip number inside the PSX in use.
0x23   Returns another chip number.
0x24   Returns yet another chip number. Same as 0x22 on some PSXs.





Root Counters 


Overview 
Root counters are timers in the PSX. There are 4 root counters.

Counter   Base address    Synced to
0         0x1f80_1100     pixel clock
1         0x1f80_1110     horizontal retrace
2         0x1f80_1120     1/8 system clock
3         -               vertical retrace

Each has three registers: one with the current value, one with the counter mode, and one with a target value.


0x11n0   Count   [read]

 31           16 15            0
|    Garbage    |     Count     |

Count   Current count value, 0x0000-0xffff
Upper word seems to contain only garbage.


0x11n4   Mode   [read/write]

 31     10  9     8     7    6     5    4     3     2  1   0
| Garbage | Div | Clc |    | Iq2 |    | Iq1 | Tar |      | En |

En    0   Counter running
      1   Counter stopped (only counter 2)
Tar   0   Count to $ffff
      1   Count to value in target register
Iq1       Set both for IRQ on target reached.
Iq2
Clc   0   System clock (it seems)
      1   Pixel clock (counter 0)
          Horizontal retrace (counter 1)
Div   0   System clock (it seems)
      1   1/8 * System clock (counter 2)

When Clc and Div of the counters are zero, they all run at the same speed. This speed seems to be about 8 times the
normal speed of root counter 2, which is specified as 1/8 the system clock.
Bits 10 to 31 seem to contain only garbage.


0x11n8   Target   [read/write]

 31           16 15            0
|    Garbage    |    Target     |

Target   Target value, 0x0000-0xffff
Upper word seems to contain only garbage.

Quick step-by-step:


To set up an interrupt using these counters you can do the following:
1 - Reset the counter. (Mode = 0)
2 - Set its target value, set mode.
3 - Enable the corresponding bit in the interrupt mask register ($1f801074)
      bit 3 = Counter 3 (Vblank)
      bit 4 = Counter 0 (System clock)
      bit 5 = Counter 1 (Hor retrace)
      bit 6 = Counter 2 (Pixel)
4 - Open an event. (OpenEvent BIOS call - $b0, $08)
    With the following arguments:
      a0 - Rootcounter event descriptor or'd with the counter number.
           ($f2000000 - counter 0, $f2000001 - counter 1, $f2000002 - counter 2, $f2000003 - counter 3)
      a1 - Spec = $0002 - interrupt event.
      a2 - Mode = Interrupt handling ($1000)
      a3 - Pointer to your routine to be executed.
    The return value in V0 is the event identifier.
5 - Enable the event, with the corresponding BIOS call ($b0, $0c) with the identifier as argument.
6 - Make sure interrupts are enabled. (Bit 0 and bit 10 of the COP0 status register must be set.)

Your handler just has to restore the registers it uses, and it should terminate with a normal jr ra.


To turn off the interrupt, first call DisableEvent ($b0, $0d) and then close it using the CloseEvent call ($b0, $09), both
with the event number as argument.
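The constants used in steps 3 and 4 depend only on the counter number. A minimal Python sketch that derives them, assuming the register block of counter n starts at 0x1f80_1100 + n*0x10 with count, mode and target at offsets 0, 4 and 8; the function names are invented for this example.

```python
ROOT_COUNTER_EVENT = 0xF2000000
ROOT_COUNTER_BASE = 0x1F801100

# Interrupt mask register bit for each counter, as listed in step 3.
IRQ_MASK_BIT = {3: 3, 0: 4, 1: 5, 2: 6}


def event_descriptor(counter):
    """Value for a0 of the OpenEvent call."""
    if counter not in IRQ_MASK_BIT:
        raise ValueError(f"no such root counter: {counter}")
    return ROOT_COUNTER_EVENT | counter


def irq_mask(counter):
    """Bit to set in the interrupt mask register at $1f801074."""
    return 1 << IRQ_MASK_BIT[counter]


def counter_registers(counter):
    """(count, mode, target) register addresses for counters 0-2."""
    if counter not in (0, 1, 2):
        raise ValueError("only counters 0-2 have a listed base address")
    base = ROOT_COUNTER_BASE + counter * 0x10
    return base, base + 4, base + 8
```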


Controllers 


Overview 

The PSX uses a 9 pin device connector for use with the PSX controller. The controller port is electrically exactly the
same as the memory card port. The only differences are the device driver that uses it and its external port
shape. The controllers are accessed via the InitPAD, StartPAD, StopPAD, PAD_init, and PAD_dr BIOS commands.
These are covered in detail within the BIOS section of this document. The controller is a type of "smart device" that
communicates data serially via the port. Port information is as follows.


pin   9 8 7 6 5 4 3 2 1

pin   name   dir   active   description
1     dat    in    pos      data from pad or memory-card
2     cmd    out   pos      command data to pad or memory-card
3     +7V    -     -        +7.6V power source for CD-ROM drive
4     gnd    -     -        ground
5     +3V    -     -        +3.6V power source for system
6     sel    out   neg      select pad or memory-card
7     clk    out   -        clock
8     -      -     -        -
9     ack    in    neg      acknowledge signal from pad or memory-card

1) Direction (in/out) is as seen from the PSX.
2) The metal edge in the pad connector is connected to pin 4 and the cable shield.
3) Signal SEL is separate for PAD1 and PAD2.





Communication timing chart
Timing is compatible in the PAD as well as the Memory-card.


Overview

[Timing diagram: SEL-, CLK, CMD, DAT and ACK during one exchange. Bytes on the data lines, in order:]

CMD    01h    42h    00h    00h    00h
DAT    ---    ID     5Ah    key1   key2

Top command. First communication (device check)

[Timing diagram: SEL-, CLK, CMD, DAT and ACK waveforms for the first byte]

X = none, --- = Hi-Z


- 0x81 is memory-card, 0x01 is standard-pad at top command.
- Serial data transfer is in LSB-first format.
- Data is output on the falling edge; the PSX reads it on the rising edge of the shift clock.
- The PSX assumes no connection if Acknowledge is not returned within 100 usec.
- Clock pulse is 250 kHz.
- No Acknowledge is needed after the last data.
- Acknowledge signal width is more than 2 usec.
- Time is 16 msec between SEL and the previous SEL.
- SEL- for memory card in PAD access.
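The LSB-first rule above is easy to get backwards, so here is a minimal sketch of it in C. The function names are made up for illustration; the only thing taken from the text is that bit 0 of each byte goes on the wire first.

```c
#include <stdint.h>

/* Write the 8 bits of `byte` into `out` in the order they appear on the
 * wire (bit 0 first), as '0'/'1' characters plus a terminating NUL. */
void pad_wire_bits(uint8_t byte, char out[9])
{
    for (int i = 0; i < 8; i++)
        out[i] = ((byte >> i) & 1) ? '1' : '0';
    out[8] = '\0';
}

/* Rebuild a byte from 8 bits sampled in wire order (bit 0 first). */
uint8_t pad_byte_from_wire(const char bits[8])
{
    uint8_t byte = 0;
    for (int i = 0; i < 8; i++)
        if (bits[i] == '1')
            byte |= (uint8_t)(1u << i);
    return byte;
}
```

With this ordering the top command 0x01 goes out as 10000000, which is how the bytes are drawn in the pad timing flow.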




Communication format with the Pad
After the command 0x01 is sent to the pad from the system, the pad
replies with a one-byte pad ID followed by 0x5A, then it sends a 2-byte key code and
any extended code.











Normal Pad timing flow ->  (bits shown in wire order, bit 0 first)

        byte 1     byte 2     byte 3     byte 4     byte 5
CMD     10000000   01000010   00000000   00000000   00000000
DAT     xxxxxxxx   10000010   01011010   01234567   01234567
        (01h)      (ID 41h)   (5Ah)      (key 1)    (key 2)


data contents of normal PAD (push low)

| byte | b7   | b6   | b5   | b4   | b3  | b2 | b1 | b0  |
|  0   | ---                                             |
|  1   | 0x41 'A'                                        |
|  2   | 0x5A 'Z'                                        |
|  3   | LEFT | DOWN | RGHT | UP   | STA | 1  | 1  | SEL |
|  4   | SQR  | X    | O    | TRI  | R1  | L1 | R2 | L2  |

SQR = square, TRI = triangle
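A minimal sketch of decoding a normal pad reply in C. It assumes the five-byte layout described here: byte 2 is 0x5A, byte 3 carries LEFT, DOWN, RIGHT, UP and START in bits 7 to 3 with SELECT in bit 0, and byte 4 carries square, X, O, triangle, R1, L1, R2, L2 in bits 7 to 0. The names `pad_decode` and `PAD_*` are made up for the example, and the SELECT position is an assumption.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit positions in the 16-bit key word (byte 3 in the low half, byte 4
 * in the high half). Names are illustrative, not official. */
enum {
    PAD_SELECT = 1 << 0,    PAD_START  = 1 << 3,
    PAD_UP     = 1 << 4,    PAD_RIGHT  = 1 << 5,
    PAD_DOWN   = 1 << 6,    PAD_LEFT   = 1 << 7,
    PAD_L2     = 1 << 8,    PAD_R2     = 1 << 9,
    PAD_L1     = 1 << 10,   PAD_R1     = 1 << 11,
    PAD_TRIANGLE = 1 << 12, PAD_CIRCLE = 1 << 13,
    PAD_CROSS    = 1 << 14, PAD_SQUARE = 1 << 15
};

/* reply[0..4] are the five bytes received on DAT. Returns false if the
 * 0x5A marker is missing; otherwise stores a mask with 1 = pressed. */
bool pad_decode(const uint8_t reply[5], uint16_t *pressed)
{
    if (reply[2] != 0x5A)
        return false;
    uint16_t raw = (uint16_t)(reply[3] | (reply[4] << 8));
    *pressed = (uint16_t)~raw;   /* wire format is active low */
    return true;
}
```

Inverting once at the edge means the rest of a program can test `pressed & PAD_START` without remembering that the pad reports "push low".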





data contents of NEGCON (NAMCO analog controller, push low)

| byte | b7   | b6   | b5   | b4   | b3  | b2 | b1 | b0 |
|  0   | ?                                              |
|  1   | ? (ID)                                         |
|  2   | ?                                              |
|  3   | LEFT | DOWN | RGHT | UP   | STA | 1  | 1  | 1  |
|  4   | ?                                              |
|  5   | handle data  right:0x00, center:0x80           |
|  6   | ADC data                                       |
|  7   | ADC data                                       |
|  8   | ADC data                                       |

The data bit length of the ADC data in bytes 6 to 8 is unknown (it may be 7 or 8 bits).


mouse data contents (push low)

| byte | b7 ... b0                                     |
|  0   | ?                                             |
|  1   | ? (ID)                                        |
|  2   | 0x5A 'Z'                                      |
|  3   | ?                                             |
|  4   | ?                                             |
|  5   | V moves  8bitSigned  up:+, dwn:-, stay:00     |
|  6   | H moves  8bitSigned  up:+, dwn:-, stay:00     |
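The two mouse movement bytes are 8-bit signed values. A small sketch in C of turning one raw byte into a signed delta, assuming ordinary two's complement encoding (the function name is made up):

```c
/* Convert one raw movement byte (two's complement) to a signed delta. */
int mouse_delta(unsigned char raw)
{
    return raw >= 0x80 ? (int)raw - 256 : (int)raw;
}
```

A byte of 00 means the mouse stayed put, small values are positive movement, and values from 0x80 upward wrap around to negative movement.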





Memory cards 


Memory Card Format 


The memory card for the PSX is 128 kilobytes of non-volatile RAM. This is split into 16 blocks of
8 kilobytes each. The very first block is a header block used as a directory and file allocation table,
leaving 15 blocks for data storage.

The data blocks contain the program data file name, block name, icon, and other critical information. The
PSX accesses the data via a "frame" method. Each block is split into 64 frames of 128 bytes each. The first frame
(frame 0) is the file name, frames 1 to 3 contain the icon (each frame of animation taking up one frame), and the
rest of the frames hold save data.
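The block and frame sizes above can be sketched as a few constants and an offset helper in C. This assumes a raw 128 KB card image laid out block after block; the names are made up for the example.

```c
enum {
    MC_FRAME_SIZE       = 0x80,                                 /* 128 bytes */
    MC_FRAMES_PER_BLOCK = 64,
    MC_BLOCK_SIZE       = MC_FRAME_SIZE * MC_FRAMES_PER_BLOCK,  /* 0x2000    */
    MC_BLOCK_COUNT      = 16,
    MC_CARD_SIZE        = MC_BLOCK_SIZE * MC_BLOCK_COUNT        /* 0x20000   */
};

/* Byte offset of a frame within a raw card image, or -1 if out of range. */
long mc_frame_offset(int block, int frame)
{
    if (block < 0 || block >= MC_BLOCK_COUNT ||
        frame < 0 || frame >= MC_FRAMES_PER_BLOCK)
        return -1;
    return (long)block * MC_BLOCK_SIZE + (long)frame * MC_FRAME_SIZE;
}
```

For example, the title frame of the first data block is `mc_frame_offset(1, 0)`, which is 0x2000 bytes into the image.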







Terms and Data Format 
This is the format of the various objects within the memory card. 


File Name 

Country code (2 bytes) + Product number (10 bytes) + identifier (8 bytes). An example of a product number is
SCPS-00000. The identifier is a variation on the name of the game; for example, FF8 will be FF0800, FF0801. The
format of the product number is 4 characters, a hyphen, and then 5 characters. The actual characters don't really matter. With
a PocketStation program, the product ID is a monochrome icon, a hyphen, and the latter part containing a "P".


Country Code 
In Japan the code is BI, Europe is BE, and America is BA. An American PSX can use memory saves with
the BI country code.
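A minimal sketch in C of splitting a file name into the three parts described above (2-byte country code, 10-byte product number, identifier of up to 8 bytes). The struct and function names are made up, and the sample name in the usage note is built from the document's own examples rather than taken from a real save.

```c
#include <string.h>

struct mc_filename {
    char country[3];     /* 2 chars + NUL       */
    char product[11];    /* 10 chars + NUL      */
    char identifier[9];  /* up to 8 chars + NUL */
};

/* Returns 0 on success, -1 if the length is wrong or the hyphen of the
 * product number is not where the format says it should be. */
int mc_split_filename(const char *name, struct mc_filename *out)
{
    size_t len = strlen(name);
    if (len < 12 || len > 20 || name[6] != '-')
        return -1;
    memcpy(out->country, name, 2);
    out->country[2] = '\0';
    memcpy(out->product, name + 2, 10);
    out->product[10] = '\0';
    memcpy(out->identifier, name + 12, len - 12);
    out->identifier[len - 12] = '\0';
    return 0;
}
```

Given "BISCPS-00000FF0800" this yields country "BI", product "SCPS-00000" and identifier "FF0800".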


Title 
The title is in Shift-JIS format with a max of 32 characters. ASCII can be used, as ASCII is a subset of
Shift-JIS.


XOR Code 
This is a checksum. Each byte is XORed one by one and the result is stored. Complies with the checksum 
protocol. 
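A minimal sketch of the XOR code in C. It assumes the code covers the first 127 bytes of a 128-byte frame and is stored in the last byte (0x7F); the function names are made up.

```c
#include <stdint.h>
#include <stddef.h>

/* XOR all bytes of `data` together, one by one. */
uint8_t mc_xor_code(const uint8_t *data, size_t len)
{
    uint8_t x = 0;
    for (size_t i = 0; i < len; i++)
        x ^= data[i];
    return x;
}

/* Check a 128-byte frame whose last byte (0x7F) holds the XOR code. */
int mc_frame_ok(const uint8_t frame[128])
{
    return mc_xor_code(frame, 127) == frame[127];
}
```

Because XOR is its own inverse, XORing all 128 bytes of a valid frame (code included) gives zero, which is a handy second way to check a frame.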


Link 
This is a sequence of 3 bytes to link blocks together to form one continuous data block.


Data Size
| Memory card | 128KB = 131,072 bytes = 0x20000 bytes |
| Block       | 8KB = 8192 bytes = 0x2000 bytes       |
| Frame       | 128 bytes = 0x80 bytes                |


Header Frame 


Directory Frame

0x00          Available blocks
                upper 4 bits
                  A - Available
                  5 - Partially used
                  F - Unusable
                lower 4 bits
                  0 - Unused
                  1 - There is no link, but one will be here later
                  2 - Mid link block
                  3 - Terminating link block
                  F - Unusable
                Example
                  A0 - Open block
                  51 - In use, there will be a link in the next block
                  52 - In use, this is in a link and will link to another
                  53 - In use, this is the last in the link
                  FF - Unusable

0x01 - 0x03   00 00 00
                When it's reserved it's FF FF FF

0x04 - 0x07   Use byte
                00 00 00 - Open block, middle link block, or end link block
                Block * 0x2000 - No link, but will be a link
                  (00 20 00 - one block will be used)
                  (00 40 00 - two blocks will be used)
                  (00 E0 01 - 15 blocks will be used)

0x08 - 0x09   Link order, block 0-14
                If the block isn't in a link, or if it's the last link in the line, it's 0xFFFF

0x0A - 0x0B   Country Code (BI, BA, BE)

0x0C - 0x15   Product Code (AAAA-00000)
                Japan    SLPS, SCPS (from SCEI)
                America  SLUS, SCUS (from SCEA)
                Europe   SLES, SCES (from SCEE)

0x16 - 0x1D   Identifier
                This number is unique to the current game being played. Every save made after the
                first time a game is saved on the card has the same identifier, but if a new game is
                started from the beginning, that will have a different identifier.

0x1E - 0x7E

0x7F          XOR Code


THE FOLLOWING DATA REPEATS FOR THE NEXT 15 BLOCKS, THEN BLOCK 1 STARTS 
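A minimal sketch in C of reading the first fields of one directory frame: the state byte at 0x00, the "use byte" size at 0x04-0x07 and the link order at 0x08-0x09. The size is read as little-endian, which matches the examples (00 E0 01 is 0x01E000, or 15 blocks); treating the link field as little-endian too is an assumption. All names are made up.

```c
#include <stdint.h>

/* Human-readable meaning of the state byte at offset 0x00. */
const char *mc_state_name(uint8_t state)
{
    switch (state) {
    case 0xA0: return "open";
    case 0x51: return "in use, first";
    case 0x52: return "in use, middle";
    case 0x53: return "in use, last";
    case 0xFF: return "unusable";
    default:   return "unknown";
    }
}

/* Number of blocks a save occupies, from the size field at 0x04-0x07
 * (zero for open, middle and end blocks). */
unsigned mc_blocks_used(const uint8_t dir[128])
{
    uint32_t size = (uint32_t)dir[4] | ((uint32_t)dir[5] << 8) |
                    ((uint32_t)dir[6] << 16) | ((uint32_t)dir[7] << 24);
    return (unsigned)(size / 0x2000u);
}

/* Next block in the chain, or -1 if the link field is 0xFFFF. */
int mc_next_block(const uint8_t dir[128])
{
    unsigned link = (unsigned)dir[8] | ((unsigned)dir[9] << 8);
    return link == 0xFFFFu ? -1 : (int)link;
}
```

Walking a save is then a loop: start at a frame whose state is 51, follow `mc_next_block` until it returns -1, and expect to have visited `mc_blocks_used` blocks.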





Block Structure

Frame 0             Title Frame

  0x00              'S' (0x53)
  0x01              'C' (0x43)
  0x02              Icon Display Flag
                      00...No icon
                      11...Icon has 1 frame of animation (static)
                      12...Icon has 2 frames
                      13...Icon has 3 frames
  0x03              Block Number (1-15)
  0x04 - 0x43       Title
                      This is the title in Shift-JIS format; it allows for 32 characters to be written
  0x44 - 0x5F       Reserved (00h)
                      This is used for the PocketStation.
  0x60 - 0x7F       Icon 16 Color Palette Data

Frame 1 - Frame 3   Icon Frame

  0x00 - 0x7F       Icon Bitmap
                      1 frame of animation = 1 frame of data.
                      If there is no icon for this block, it's data instead.

Frame 4 ...         Data Frame

  0x00 - 0x7F       Save Data
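A minimal sketch in C of checking a title frame and working out where save data begins. It assumes the icon display flag values 11, 12 and 13 are hexadecimal, and that save data starts right after the last icon frame (so in frame 1 when there is no icon). The function names are made up.

```c
#include <stdint.h>

/* Returns the number of icon animation frames (0-3) declared in a title
 * frame, or -1 if the frame does not start with "SC" or the flag value
 * is not one of those listed. */
int mc_icon_frames(const uint8_t title_frame[128])
{
    if (title_frame[0] != 'S' || title_frame[1] != 'C')
        return -1;
    switch (title_frame[2]) {
    case 0x00: return 0;
    case 0x11: return 1;
    case 0x12: return 2;
    case 0x13: return 3;
    default:   return -1;
    }
}

/* Index of the first frame in the block that holds save data. */
int mc_first_data_frame(const uint8_t title_frame[128])
{
    int icons = mc_icon_frames(title_frame);
    return icons < 0 ? -1 : 1 + icons;
}
```

A save with a three-frame animated icon therefore keeps its data from frame 4 onward, which matches the layout shown for a full icon.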


Link Block

  0x00 - 0x7F


Data Transmission 
Data is transmitted with exactly the same protocol as the Pad data is transmitted/received. The pinout is
exactly the same as well; the housing, however, is a different shape.


(front view of PSX)


Serial I/O


The PSX has an 8 pin serial adapter that uses a non-standard protocol for data transmission and receiving.
The pinouts are pictured here.


1 2 3 4 5 6 7 8


1 <---- Carrier Detect (CD)
2       Ground
3 <---- Clear To Send (CTS)
4       Data Terminal Ready ---->
5       Transmit ---->
6       Ready To Send ---->
7       3.3 Vdc
8 <---- Receive


The port speed is able to go up to a maximum of 256K bps. Normally it's used at 56K. On connection
problems the port will attempt a reconnect, but may not fall back on a slower speed. The link cable is wired as such.


[Diagram: link cable wiring, not recoverable]


The pins are arranged like this (looking into the pins of the link cable connector, with the
connector facing up):


                        CABLE

        /    UP     \                     /    UP     \
LEFT | 1 2 3 4 5 6 7 8 | RIGHT    LEFT | 1 2 3 4 5 6 7 8 | RIGHT














Parallel I/O 


Overview 
The Parallel port is something of a faux name. It's really an expansion port. Any device connected to this port
will have access to everything on the local bus. The address range of the PIO port starts at 0x1f00_0000.


[Figure: PSX-PIO pinout. Legible signal labels include SDATA, a 44.1 kHz clock, /WR, address lines A23, A21,
A19, A17, A15, A13 and A11, and GND.]


Appendix A


Number systems


The Hexadecimal system is as follows


| Decimal | Hexadecimal |     | Decimal | Hexadecimal |
|    0    |      0      |     |    8    |      8      |
|    1    |      1      |     |    9    |      9      |
|    2    |      2      |     |   10    |      A      |
|    3    |      3      |     |   11    |      B      |
|    4    |      4      |     |   12    |      C      |
|    5    |      5      |     |   13    |      D      |
|    6    |      6      |     |   14    |      E      |
|    7    |      7      |     |   15    |      F      |

Larger values continue the same pattern: 16 is 0x10, 255 is 0xFF.


Appendix B
BIOS calls 


1st column - Address to call 
2nd column - Value of $t1 when calling 
3rd column - Name of the function 


Arguments, when needed, are passed in $a0, $a1, $a2 and $a3, and at $sp+0x10 when there are more
than 4 arguments.


0x00a0 - 0x0000 - int open(char *name , int mode)
0x00a0 - 0x0001 - int lseek(int fd , int offset , int whence)
0x00a0 - 0x0002 - int read(int fd , char *buf , int nbytes)
0x00a0 - 0x0003 - int write(int fd , char *buf , int nbytes)
0x00a0 - 0x0004 - close(int fd)
0x00a0 - 0x0005 - int ioctl(int fd , int cmd , int arg)
0x00a0 - 0x0006 - exit()
0x00a0 - 0x0007 - sys_b0_39()
0x00a0 - 0x0008 - char getc(int fd)
0x00a0 - 0x0009 - putc(char c , int fd)
0x00a0 - 0x000a - todigit
0x00a0 - 0x000b - double atof(char *s)
0x00a0 - 0x000c - long strtoul(char *s , char **ptr , int base)
0x00a0 - 0x000d - unsigned long strtol(char *s , char **ptr , int base)
0x00a0 - 0x000e - int abs(int val)
0x00a0 - 0x000f - long labs(long lval)
0x00a0 - 0x0010 - long atoi(char *s)
0x00a0 - 0x0011 - int atol(char *s)
0x00a0 - 0x0012 - atob
0x00a0 - 0x0013 - int setjmp(jmp_buf *ctx)
0x00a0 - 0x0014 - longjmp(jmp_buf *ctx , int value)
0x00a0 - 0x0015 - char *strcat(char *dst , char *src)
0x00a0 - 0x0016 - char *strncat(char *dst , char *src , int n)
0x00a0 - 0x0017 - int strcmp(char *dst , char *src)
0x00a0 - 0x0018 - int strncmp(char *dst , char *src , int n)
0x00a0 - 0x0019 - char *strcpy(char *dst , char *src)
0x00a0 - 0x001a - char *strncpy(char *dst , char *src , int n)
0x00a0 - 0x001b - int strlen(char *s)
0x00a0 - 0x001c - int index(char *s , int c)
0x00a0 - 0x001d - int rindex(char *s , int c)
0x00a0 - 0x001e - char *strchr(char *c , int c)
0x00a0 - 0x001f - char *strrchr(char *c , int c)
0x00a0 - 0x0020 - char *strpbrk(char *dst , *src)
0x00a0 - 0x0021 - int strspn(char *s , char *set)
0x00a0 - 0x0022 - int strcspn(char *s , char *set)
0x00a0 - 0x0023 - strtok(char *s , char *set)
0x00a0 - 0x0024 - strstr(char *s , char *set)
0x00a0 - 0x0025 - int toupper(int c)
0x00a0 - 0x0026 - int tolower(int c)
0x00a0 - 0x0027 - void bcopy(void *src , void *dst , int len)
0x00a0 - 0x0028 - void bzero(void *ptr , int len)
0x00a0 - 0x0029 - int bcmp(void *ptr1 , void *ptr2 , int len)
0x00a0 - 0x002a - memcpy(void *dst , void *src , int n)
0x00a0 - 0x002b - memset(void *dst , char c , int n)
0x00a0 - 0x002c - memmove(void *dst , void *src , int n)
0x00a0 - 0x002d - memcmp(void *dst , void *src , int n)
0x00a0 - 0x002e - memchr(void *s , int c , int n)
0x00a0 - 0x002f - int rand()
0x00a0 - 0x0030 - void srand(unsigned int seed)
0x00a0 - 0x0031 - void qsort(void *base , int nel , int width , int (*cmp)(void * , void *))
0x00a0 - 0x0032 - double strtod(char *s , char *endptr)
0x00a0 - 0x0033 - void *malloc(int size)
0x00a0 - 0x0034 - free(void *buf)
0x00a0 - 0x0035 - void *lsearch(void *key , void *base , int belp , int width , int (*cmp)(void * , void *))
0x00a0 - 0x0036 - void *bsearch(void *key , void *base , int nel , int size , int (*cmp)(void * , void *))
0x00a0 - 0x0037 - void *calloc(int size , int n)
0x00a0 - 0x0038 - void *realloc(void *buf , int n)
0x00a0 - 0x0039 - InitHeap(void *block , int n)
0x00a0 - 0x003a - exit
0x00a0 - 0x003b - char getchar(int fd)
0x00a0 - 0x003c - putchar(char c , int fd)
0x00a0 - 0x003d - char *gets(char *s)
0x00a0 - 0x003e - puts(char *s)
0x00a0 - 0x003f - printf(char *fmt , ...)
0x00a0 - 0x0041 - LoadTest(char *name , XF_HDR *header)
0x00a0 - 0x0042 - Load(char *name , XF_HDR *header)
0x00a0 - 0x0043 - Exec(struct EXEC *header , int argc , char **argv)
0x00a0 - 0x0044 - FlushCache()
0x00a0 - 0x0045 - void InstallInterruptHandler()
0x00a0 - 0x0046 - GPU_dw
0x00a0 - 0x0048 - int SetGPUStatus(int status)
0x00a0 - 0x0049 - GPU_cw
0x00a0 - 0x004a - GPU_cwb (not sure)
0x00a0 - 0x004d - int GetGPUStatus()
0x00a0 - 0x004e - GPU_sync
0x00a0 - 0x0051 - LoadExec(char *name , int , int)
0x00a0 - 0x0052 - GetSysSp()
0x00a0 - 0x0054 - _96_init()
0x00a0 - 0x0055 - _bu_init()
0x00a0 - 0x0056 - _96_remove()
0x00a0 - 0x0057 - return 0 (it only does this)
0x00a0 - 0x0058 - return 0 (it only does this)
0x00a0 - 0x0059 - return 0 (it only does this)
0x00a0 - 0x005a - return 0 (it only does this)
0x00a0 - 0x005b - dev_tty_init
0x00a0 - 0x005c - dev_tty_open
0x00a0 - 0x005e - dev_tty_ioctl
0x00a0 - 0x005f - dev_cd_open
0x00a0 - 0x0060 - dev_cd_read
0x00a0 - 0x0061 - dev_cd_close
0x00a0 - 0x0062 - dev_cd_firstfile
0x00a0 - 0x0063 - dev_cd_nextfile
0x00a0 - 0x0064 - dev_cd_chdir
0x00a0 - 0x0065 - dev_card_open
0x00a0 - 0x0066 - dev_card_read
0x00a0 - 0x0067 - dev_card_write
0x00a0 - 0x0068 - dev_card_close
0x00a0 - 0x0069 - dev_card_firstfile
0x00a0 - 0x006a - dev_card_nextfile
0x00a0 - 0x006b - dev_card_erase
0x00a0 - 0x006c - dev_card_undelete
0x00a0 - 0x006d - dev_card_format
0x00a0 - 0x006e - dev_card_rename
0x00a0 - 0x0070 - _bu_init
0x00a0 - 0x0071 - _96_init
0x00a0 - 0x0072 - _96_remove
0x00a0 - 0x0078 - _96_CdSeekL
0x00a0 - 0x007c - _96_CdGetStatus
0x00a0 - 0x007e - _96_CdRead
0x00a0 - 0x0085 - _96_CdStop
0x00a0 - 0x0096 - AddCDROMDevice()
0x00a0 - 0x0097 - AddMemCardDevice()
0x00a0 - 0x0098 - DisableKernelIORedirection()
0x00a0 - 0x0099 - EnableKernelIORedirection()
0x00a0 - 0x009c - SetConf(int Event , int TCB , int Stack)
0x00a0 - 0x009d - GetConf(int *Event , int *TCB , int *Stack)
0x00a0 - 0x009f - SetMem(int size)
0x00a0 - 0x00a0 - boot
0x00a0 - 0x00a1 - SystemError
0x00a0 - 0x00a2 - EnqueueCdIntr
0x00a0 - 0x00a3 - DequeueCdIntr
0x00a0 - 0x00a5 - ReadSector(count,sector,buffer)
0x00a0 - 0x00a6 - get_cd_status ??
0x00a0 - 0x00a7 - bufs_cb_0
0x00a0 - 0x00a8 - bufs_cb_1
0x00a0 - 0x00a9 - bufs_cb_2
0x00a0 - 0x00aa - bufs_cb_3
0x00a0 - 0x00ab - _card_info
0x00a0 - 0x00ac - _card_load
0x00a0 - 0x00ad - _card_auto
0x00a0 - 0x00ae - bufs_cb_4
0x00a0 - 0x00b2 - do_a_long_jmp()
0x00a0 - 0x00b4 - (sub_function)
    0 - u_long GetKernelDate (date is in 0xYYYYMMDD BCD format)
    1 - u_long GetKernel???? (returns 3 on cex1000 and cex3000)
    2 - char *GetKernelRomVersion()
    3 - ?
    4 - ?
    5 - u_long GetRamSize() (in bytes)
    6->F - ??
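The 0xYYYYMMDD BCD format returned by GetKernelDate stores one decimal digit per nibble. A minimal sketch of unpacking it in C follows; the struct and function names are made up, and the sample value in the usage note is only an illustration, not a real kernel date.

```c
#include <stdint.h>

struct kernel_date { int year, month, day; };

/* Decode the low two BCD digits (one byte) of `v` to a number 0-99. */
static int bcd2(uint32_t v)
{
    return (int)(((v >> 4) & 0xF) * 10 + (v & 0xF));
}

/* Unpack a 0xYYYYMMDD BCD value into year, month and day. */
struct kernel_date decode_kernel_date(uint32_t raw)
{
    struct kernel_date d;
    d.year  = bcd2(raw >> 24) * 100 + bcd2(raw >> 16);
    d.month = bcd2(raw >> 8);
    d.day   = bcd2(raw);
    return d;
}
```

For instance, a raw value of 0x19951204 would decode to year 1995, month 12, day 4. Printing the raw value with `%08x` gives the same digits directly, which is the main convenience of BCD.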


0x00b0 - 0x0000 - SysMalloc (to malloc kernel memory)
0x00b0 - 0x0007 - DeliverEvent(class , event)
0x00b0 - 0x0008 - OpenEvent(class , spec , mode , func) (source code is corrected)
0x00b0 - 0x0009 - CloseEvent(event)
0x00b0 - 0x000a - WaitEvent(event)
0x00b0 - 0x000b - TestEvent(event)
0x00b0 - 0x000c - EnableEvent(event)
0x00b0 - 0x000d - DisableEvent(event)
0x00b0 - 0x000e - OpenTh
0x00b0 - 0x000f - CloseTh
0x00b0 - 0x0010 - ChangeTh
0x00b0 - 0x0012 - InitPad
0x00b0 - 0x0013 - StartPad
0x00b0 - 0x0014 - StopPAD
0x00b0 - 0x0015 - PAD_init
0x00b0 - 0x0016 - u_long PAD_dr()
0x00b0 - 0x0017 - ReturnFromException
0x00b0 - 0x0018 - ResetEntryInt
0x00b0 - 0x0019 - HookEntryInt
0x00b0 - 0x0020 - UnDeliverEvent(class , event)
0x00b0 - 0x0032 - int open(char *name , int access)
0x00b0 - 0x0033 - int lseek(int fd , long pos , int seektype)
0x00b0 - 0x0034 - int read(int fd , void *buf , int nbytes)
0x00b0 - 0x0035 - int write(int fd , void *buf , int nbytes)
0x00b0 - 0x0036 - close(int fd)
0x00b0 - 0x0037 - int ioctl(int fd , int cmd , int arg)
0x00b0 - 0x0038 - exit(int exitcode)
0x00b0 - 0x003a - char getc(int fd)
0x00b0 - 0x003b - putc(int fd , char ch)
0x00b0 - 0x003c - char getchar()
0x00b0 - 0x003d - putchar(char ch)
0x00b0 - 0x003e - char *gets(char *s)
0x00b0 - 0x003f - puts(char *s)
0x00b0 - 0x0040 - cd
0x00b0 - 0x0041 - format
0x00b0 - 0x0042 - firstfile
0x00b0 - 0x0043 - nextfile
0x00b0 - 0x0044 - rename
0x00b0 - 0x0045 - delete
0x00b0 - 0x0046 - undelete
0x00b0 - 0x0047 - AddDevice (used by AddXXXDevice)
0x00b0 - 0x0048 - RemoveDevice
0x00b0 - 0x0049 - PrintInstalledDevices
0x00b0 - 0x004a - InitCARD
0x00b0 - 0x004b - StartCARD
0x00b0 - 0x004c - StopCARD
0x00b0 - 0x004e - _card_write
0x00b0 - 0x004f - _card_read
0x00b0 - 0x0050 - _new_card
0x00b0 - 0x0051 - Krom2RawAdd
0x00b0 - 0x0054 - long _get_errno(void)
0x00b0 - 0x0055 - long _get_error(long fd)
0x00b0 - 0x0056 - GetC0Table
0x00b0 - 0x0057 - GetB0Table
0x00b0 - 0x0058 - _card_chan
0x00b0 - 0x005b - ChangeClearPad(int)
0x00b0 - 0x005c - _card_status
0x00b0 - 0x005d - _card_wait


0x00c0 - 0x0000 - InitRCnt
0x00c0 - 0x0001 - InitException
0x00c0 - 0x0002 - SysEnqIntRP(int index , long *queue)
0x00c0 - 0x0003 - SysDeqIntRP(int index , long *queue)
0x00c0 - 0x0004 - get_free_EvCB_slot()
0x00c0 - 0x0005 - get_free_TCB_slot()
0x00c0 - 0x0006 - ExceptionHandler
0x00c0 - 0x0007 - InstallExceptionHandlers
0x00c0 - 0x0008 - SysInitMemory
0x00c0 - 0x0009 - SysInitKMem
0x00c0 - 0x000a - ChangeClearRCnt
0x00c0 - 0x000b - SystemError ???
0x00c0 - 0x000c - InitDefInt
0x00c0 - 0x0012 - InstallDevices
0x00c0 - 0x0013 - FlushStdInOutPut
0x00c0 - 0x0014 - return 0
0x00c0 - 0x0015 - _cdevinput
0x00c0 - 0x0016 - _cdevscan
0x00c0 - 0x0017 - char _circgetc(struct device_buf *circ)
0x00c0 - 0x0018 - _circputc(char c , struct device_buf *circ)
0x00c0 - 0x0019 - ioabort(char *str)
0x00c0 - 0x001b - KernelRedirect(int flag)
0x00c0 - 0x001c - PatchA0Table


There are 3 more I know of that aren't called the same way as above:
MIPS R3000:


Exception() {
    li $a0,0
    syscall
}


EnterCriticalSection() {
    li $a0,1
    syscall
}


ExitCriticalSection() {
    li $a0,2
    syscall
}
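For emulator or tracing work, the "address to call" and "$t1 value" columns of this appendix are naturally a two-part key. A minimal sketch in C of a lookup keyed that way follows. The struct and function names are made up, and only a handful of entries from the tables are included to keep it short.

```c
#include <stddef.h>

struct bios_call {
    unsigned table;     /* address that is called: 0xa0, 0xb0 or 0xc0 */
    unsigned index;     /* value of $t1 when calling                  */
    const char *name;
};

/* A few entries copied from the tables in this appendix. */
static const struct bios_call bios_calls[] = {
    { 0xa0, 0x3f, "printf" },
    { 0xa0, 0x44, "FlushCache" },
    { 0xb0, 0x08, "OpenEvent" },
    { 0xb0, 0x09, "CloseEvent" },
    { 0xb0, 0x0c, "EnableEvent" },
    { 0xb0, 0x0d, "DisableEvent" },
    { 0xc0, 0x0a, "ChangeClearRCnt" },
};

/* Returns the function name for (table, index), or NULL if not listed. */
const char *bios_call_name(unsigned table, unsigned index)
{
    for (size_t i = 0; i < sizeof bios_calls / sizeof bios_calls[0]; i++)
        if (bios_calls[i].table == table && bios_calls[i].index == index)
            return bios_calls[i].name;
    return NULL;
}
```

An emulator could call `bios_call_name(pc, t1)` whenever execution reaches 0xa0, 0xb0 or 0xc0 to log which BIOS function a game is using. A linear search is fine at this size; a full table might be indexed directly instead.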


Appendix C


GPU command listing
Overview of packet commands:


0x01    clear cache
0x02    frame buffer rectangle draw
0x20    monochrome 3 point polygon
0x24    textured 3 point polygon
0x28    monochrome 4 point polygon
0x2c    textured 4 point polygon
0x30    gradated 3 point polygon
0x34    gradated textured 3 point polygon
0x38    gradated 4 point polygon
0x3c    gradated textured 4 point polygon
0x40    monochrome line
0x48    monochrome polyline
0x50    gradated line
0x58    gradated polyline
0x60    rectangle
0x64    sprite
0x68    dot
0x70    8*8 rectangle
0x74    8*8 sprite
0x78    16*16 rectangle
0x7c    16*16 sprite
0x80    move image in frame buffer
0xa0    send image to frame buffer
0xc0    copy image from frame buffer
0xe1    draw mode setting
0xe2    texture window setting
0xe3    set drawing area top left
0xe4    set drawing area bottom right
0xe5    drawing offset
0xe6    mask setting
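The eight polygon commands in the list (0x20 to 0x3c) follow a regular bit pattern: 0x04 marks textured, 0x08 marks 4 points, and 0x10 marks gradated. That pattern is read off the listing, not taken from any official description. A minimal sketch in C with made-up names:

```c
#include <stdint.h>
#include <stdbool.h>

struct gpu_poly {
    int  points;     /* 3 or 4 */
    bool textured;
    bool gradated;
};

/* Returns false if `cmd` is not one of the eight polygon commands
 * listed (0x20, 0x24, 0x28, 0x2c, 0x30, 0x34, 0x38, 0x3c). */
bool gpu_decode_poly(uint8_t cmd, struct gpu_poly *out)
{
    if ((cmd & 0xE3) != 0x20)
        return false;
    out->textured = (cmd & 0x04) != 0;
    out->points   = (cmd & 0x08) ? 4 : 3;
    out->gradated = (cmd & 0x10) != 0;
    return true;
}
```

For example 0x2c decodes as a textured 4 point polygon and 0x34 as a gradated textured 3 point polygon, matching the list.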


Appendix D 


Glossary of terms 


PSX     Playstation
SCEI    Sony Computer Entertainment Incorporated (Sony of Japan)
SCEA    Sony Computer Entertainment America (Sony of America)
SCEE    Sony Computer Entertainment Europe (Sony of Europe)
GTE     Geometry Transformation Engine
GPU     Graphics Processing Unit
CPU     Central Processing Unit
MDEC    Motion DEcoding Chip
PIO     Parallel Input/Output port
SPU     Sound Processing Unit
BIOS    Basic Input/Output System


Appendix E 


Works cited - Bibliography 
History of the Sony PlayStation taken from http://www.psxpower.com


The IDTR3051(TM), R3052(TM) RISController(TM) Hardware User's Manual, Revision 1.4, July 15, 1994
©1992, 1994 Integrated Device Technology, Inc.


System.txt, cdinfo.txt, gpu.txt, spu.txt, gte.txt
doomed@c64.org http://psx.rules.org


gte-lite.txt
http://www.in-brb.de/~creature/


MDEC data from
jlo@ludd.luth.se and various people at PSXDEV mailing list
http://www.geocities.co.jp/Playtown/2004/
bero@geocities.co.jp


Memcard/PAD Data
HFB03536@nifty-serve.or.jp


PIO
bitmaster@bigfoot.com


Syscall
sgf22@cam.ac.uk


Mem card format: E-nash
http://www.vbug.or.jp/users/e-nash/
e-nash@i.am


Plus the many more at PSXDEV mailing list that helped ^_^


Exitcode 84905 


